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Introduction 

Vision systems, either artificial or biological, are confronted with the 
problem of inferring geometrical and physical properties of surfaces around 
the viewer. The available data -the images - consist of two dimensional ma¬ 
trices of light intensity values measured by an eye or a camera. For tasks such 
as navigation, manipulation and visual recognition, vision systems have to 
recover 3D properties of surfaces from the 2D images. Typical 3D properties 
are the distance between the surfaces and the viewer, their orientation, struc¬ 
ture, teiture, reflectance and motion parameters (from a temporal sequence 
of images). 

The visual skills that provide us with this kind of information have been 
explored in animals and humans with physiological and behavioural tech¬ 
niques. With the recent development of computer vision, these problems 
have been formulated rigorously and given by now familiar names, such as 
structure from stereo , structure from motion . structure from texture , shape 
from shading, edge detection , visual interpolation , computation of optical 
flow. The computational modules that solve them constitute together the 
core of early vision , and provide spatial and geometrical information about 
the 3D world. The results of this first stage of processing are then used for 
higher level tasks such as navigation in the environment, manipulation of 
objects and of course object recognition and also reasoning about objects. 
Unlike high level vision, early vision is mostly considered as a bottom-up set 
of processes that do not rely upon specific high-level information about the 
scene to be analysed. It is commonly argued on the basis of computational 
and psychophysical considerations that these different modules of early vision 
can be analysed independently of each other, at least to a first approxima¬ 
tion. Their most natural implementation is in terms of distinct pieces of 
hardware, whose outputs will be integrated at a later stage, possibly using 
more intelligent procedures (another possibility is to use coupled IVlarkov 
Random Field models for the integration stage, see [68]). 

Even a superficial analysis of these problems reveals their common in¬ 
verse nature: they can be regarded as inverse optics since they attempt to 
recover 3D properties of surfaces from the 2D images they generate. This 
observation characterizes the field of early vision as the solution of problems 
of inverse optics^ 1 '. 

It is well known that inverse problems are very often ill-posedf 2 i’ i3 i in 
the original sense of HadamardW; that is, the solution may not exist or it is 
not unique or does not depend continuously on data. These problems can 
be formulated as discrete or continuous problems according to the type of 
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the available data. In order to solve ill-posed problems a priori information 
about generic properties of the solution must be used. Regularization theory 
is a set of techniques that have been developed for this reason. Standard 
regularization exploits a priori knowledge by restricting the functional space 
t° which the ‘‘solution ’ belongs: the specific techniques use generalized in¬ 
verses or variational formulations. An alternative possibility, that we call 
stochastic regularization, is based on a Bayesian approach and optimal esti¬ 
mation. All these reasons we introduced recently regularization techniques 
into computer vision. B. Horn (see [G9j for a comprehensive review of his 
work) had approached several problems in vision from a similar point of 
view, using minimization techniques for their solution, without an explicit 
connection with regularization techniques. 

The goal of this paper is to review the mathematical aspects of this 
approach (for a less rigorous discussion, see [lj). It is organized in two 
parts. Part. One reviews the main mathematical results that characterize the 
difference between well-posed and ill-posed problems (Section 2) and between 
well-conditioned and ill-conditioned problems (Section 4). The notions of 
generalized inverses (Section 3) and of regularization methods (Section 5) 
are then introduced. Section 6 contains some results related to inverse non¬ 
linear problems. Section 7 illustrates stochastic regularization. 

Part Two shows that several variational principles recently introduced 
in early vision can be formulated as regularized solutions to ill-posed inverse 
problems. Four problems in early vision are studied in detail: edge detection 
and numerical differentiation (Section 8), optical flow (Section 9), surface 
interpolation (Section 10), and shape from shading (Section 11). 
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Part One 


1. Outline 


In this part, of the paper we review some of the methods which have been 
developed for the approximate solution of ill-posed problems. The linear 
case is discussed in detail since a well-developed theory is available. We also 
make some comments on non-linear problems. 

In Section 2 we define the class of well-posed problems, stressing that 
a well-posed problem is not necessarily robust against noise. A well-posed 
problem, in order to have solutions that are robust against noise, must also 
be well-conditioned (see Section 4). For ill-posed, linear, inverse problems, 
well-posedness can be restored by generalized solutions if the range of the 
operator (which has to be inverted) is closed (see Section 3). When the 
range of the operator is not closed, or when the problem is seriously ill- 
conditioned, regularization techniques have to be used (Section 5) in order 
to avoid the instability of the solution against noise. Therefore, since images 
are intrinsically noisy, these techniques represent the ideal tool for early 
vision problems. Some results on inverse nonlinear problems are presented 
in Section 6. A stochastic approach to inverse problems is introduced in 
Section 7 and its connections with standard regularization are discussed. 


2. Well-Posed and Ill-posed Problems 

Hadamard 1 ^ ^ ^ defined a mathematical problem to be well-posed, when* 

(a) for each data g in a given class of functions Y there exists a solution 
u in a prescribed class X ( existence ); 

(b) the solution u is unique in X ( uniqueness ); 

(c) the dependence ot a. upon g is continuous, i.e., when the error on 
the data g tends to zero, the induced error on the solution u tends also to 
zero ( continuity ). 

The requirement of continuity is related to the requirement of stability 
or robustness of the solution (see, for instance, [6]). Continuity, however, is 
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a necessary but not sufficient condition for stability. A well-posed problem 
can be ill-conditioned (see Section 4). 

All the classical problems of mathematical physics, such as the Dirichlet 
problem for elliptic equations, the forward problem for the heat equation, and 
the Cauchy problem for hyperbolic equations, are well-posed in the sense of 
Hadamard. Also, the “direct problem in scattering (or imaging) theory, 
namely the computation of the scattered radiation (image) from a known 
constitution of the sources and of the targets, is well-posed. 

“Inverse’’ problems usually are not. well-posed. In most cases an “in¬ 
verse" problem can be obtained from the “direct” one by exchanging the 
role of solution and data. For instance, in the case of scattering theory, 
the inverse problem consists in the computation of the characteristics of the 
targets from the knowledge of the sources and of the scattered radiation. 

Consider a very simple example taken from classical optics. If the energy 
distribution u is given in the object plane of an optical instrument and if the 
characteristics of the instruments are known, it is possible to compute, by 
solving the wave equation, the energy distribution g in the image plane. In 
the case of Fourier optics, one finds a linear relation between u and g: 


g{x) = J A'(.r, y\u(y)dy, (2.1) 

the kernel K{x,y) being the impulse response (point spread function) of the 
instrument. The direct problem (the computation of g given u) is clearly 
well-posed. The inverse problem (the computation of u given g) usually is 
not. 


Assume that K(x,y) = K(x-y) % where A'(«r) is a band-limited function. 
Then there exist functions a which produce a zero image (think of a function 
which has only Fourier components out of the band of the instrument ) and 
therefore uniqueness does not hold. Furthermore, if g{x) is the result of 
an experiment and therefore is affected by noise, it is not necessarily band- 
limited or it can have a band broader than that of the instrument. Under 
these circumstances the solution of Equation (2.1) does not exist. 


The need to investigate problems which are not well-posed but are of 
interest in applied science originated two interesting branches of mathemat¬ 
ical analysis: the first is the theory of generalized inverses which 

is an extension of the theory of the Moore-Penrose inverse of a matrix; 
the second is the regularization theory of ill-posed (or improperly posed) 
problems^These days, the term ill-posed is used generally 
(but not only) for those problems which do not satisfy the requirement of 
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continuity. Examples of ill-posed problems are analytic continuation, the 
Cauchy problem for elliptic equations, backsolving the heat equations, su- 
perresolution, computer tomography, Fredholm integral equations of the first 
kind, and as we will see, many problems in early vision. 
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3. Generalized Inverses 


Most linear inverse problems can be formulated as follows: assume that 
functional spaces A, 1 (for instance, Hilbert spaces) are given and that a 
linear, continuous operator L from X into 1 is also given; then the problem 
is to find, for some prescribed g E 1', a function u E X such that 

9 = La. (3.1) 

In this formulation, the direct problem is just the computation of g 1 given 
u. Therefore, continuity of L is equivalent to well-posedness of the direct 
Problem. Notice that Equation (2.1) is a special case of Equation (3.1). 

The problem (3.1) is well-posed if and only if the operator L is injective 
(i.e., the equation La — 0 has only the trivial solution u = 0 (uniqueness)), 
and it is onto T (existence). Then general theorems of functional analysis 
(tor instance, the “closed graph theorem”) assure that the inverse mapping 
L~ l is also continuous (continuity). 

Assume now that the equation Lu — 0 has nontrivial solutions. The set 
of these solutions is a closed subspace of JV, which is called the null space 
J\ (L) of L. This is the subspace of the invisible objects”, since they produce 
a zero image g. Assume also that the range R(L) of T, namely the set of the 
g which are images of some u E AT, is a closed subspace of 1\ An example is 
provided by the integral operator corresponding to the perfect low pass filter 

i r \i \ f + °° smd(x — y) 

( Lu)(x ) - / ———— u{y)dy. (3.2) 

J ~ oo 7r( <c y ) 

In such a case, if we take A — 1 — L 2 (— oo, Too), the null space is 
the set of all the functions u whose Fourier transform is zero on the band 
[-Q,H], while the range of L is the set of the band-limited functions with 
bandwidth fi, which is a closed subspace of L 2 (-oo,+oc). Notice that L is 
a projection operator, the so-called band-limiting operator. 

A wa y of restoring existence and uniqueness of the solution under the 
conditions above is to redefine both the solution space X and the data space 
1 . AVe take a new space X' which is the set of all the functions orthogonal 
to A(L) (in the case of (3.2), A' is the space of the square integrable fb 
bandlimited functions), and we take R{L ) as the new data space V (in the 
case (3.2) again, the space of the square integrable Sbbandlimited functions). 
Then for any g E Y' there exists a unique u E A"' such that g — Lu , (in the 
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case of (3.2) the solution is trivial: u — g) and therefore the new problem is 
well-posed. 

The redefinition oi the spaces X,Y outlined above usually is quite diffi¬ 
cult (almost impossible) in practical problems. Therefore, it is useful to have 
a method, based on the solution of variational problems, which produces the 
same result. This is just the method of generalized inverses 

3.1. Least squares solutions or pseudosolutions 


Consider first the case in which L is injective but not onto (i.e., the exis¬ 
tence condition is not satisfied). The set of functions u e X that solve the 
variational problem 

II Lu — #|| r = minimum, ( 3 . 3 ) 

where |( |y denotes the norm of Y , are called the least squares solutions (or 
pseudosolutions) of Problem (3.1). These solutions can be easilv obtained 
considering the first variation of the functional ( 3 . 3 ), 

2 Re{Lu - g, Lh)Y, ( 3 . 4 ) 

where h is an arbitrary function of X and (-,-)T the inner product of the 
Hilbert space T . Setting (3.4) equal to zero, we obtain the Euler equation 

VLu = L*g, ( 3 . 5 ) 

where L* is the adjoint of the operator L (L* is a mapping from Y into A"). 
When R{L) is closed, Equation (3.5) always has solutions but the solution 
is not unique when N{L) is nontrivial. Notice that the set of solutions of 
Equation (3.5) coincides with the set of solutions of the equation 

Lu = p 9, (3.6) 

where P is the projection onto R{L). Therefore, solving Equation ( 3 . 5 ) 

is equivalent to taking Y' = R{L) or to projecting g onto Y'. When the 
operator L is injective, the solution of (3.5) is unique and well-posedness has 
been restored. 


3.2. Normal pseudosolutions or generalized solutions 

Consider now the case in which L is not injective (i.e., the uniqueness condi¬ 
tion is not satisfied and the problem is underconstrained). Then, one looks 
for the solution of (3.5) which has minimal norm 


!| u ,|| x = minimum. 


(3.7) 
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This solution is unique and is denoted by u + . u + is usually called the gener¬ 
alized solution (or normal pseudosolution) of Problem (3.1). u + is orthogonal 
to N(L) and therefore this procedure is equivalent to taking A'' = A r (I) x . 

Since there exists a unique u + for any g G Y, a linear mapping from 
i into X is defined by 

= L+ 9- (3.8) 

The operator L + is the generalized inverse of L and it is continu¬ 
ous. Therefore, the problem of computing the generalized solution of Equa¬ 
tion (3.1) is well-posed if and only if R(L) is closed. The essential reason for 
this result is that in this case the space h can be decomposed as 

Y = R(L)®R ± (L), (3 .9) 

where © means direct sum and i? x (X) is the orthogonal complement of R(L). 
This decomposition can be made if and only if R(L) is closed. 

3.3. C-generalized solutions 


In several inverse problems, the generalized solution is trivial or does not sat¬ 
isfy some physical requirements such as smoothness. Examples are provided 
in Section 9. Then an extension of the generalized solution goes as follows: 
let p( u) be a norm or a seminorm on X (see Appendix A for the definition) 
of the following style: 

p(u) = \\Cu\\ z (3.10) 

where C is a linear operator from X into the Hilbert space Z (the constraint 
space). The operator C may not be defined everywhere on AT For instance, 
suppose X is a space of square-integrable functions and C is a differential 
operator. Therefore, in general, p(u) is defined on a subset of A, i.e. the 
domain of C 1 denoted as D(C). When the null space of C is trivial (contains 
only the null element of A ), then p{u) is a norm on D{C ); otherwise, p(u) 
is a seminorm. 

If there exists a unique least-squares solution which minimizes p(u) 
we denote it by u c and we call it a C-generalized solution. The mapping 
9 11 c defines a linear operator from Y into X, which will be called 

the C-generalized inverse of L. It is obvious that u ^ can have a nonzero 
component onto N(L), the subspace of the “objects” which are “invisible” 
under the action of the operator L. Therefore, this procedure is physically 
plausible only when the constraint describes some physical property of the 
solution of the problem. 
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Necessary and sufficient conditions for the existence of u £ for any g have 
been given in the case where R(L) is closed and C is a bounded operator 
with R(C ) also closed^. However, the assumption of a bounded constraint 
operator C may not cover the interesting case of a differential operator. 
Furthermore, when D{C) is a subset of AT, it is obvious that does not 
exist for any g 6 Y. If we denote by LD{C) the set of all the functions g F Y 
such that g — Lu with u F D(C ), then LD(C ), in general, does not coincide 
with R{L). Under these circumstances, if Pg £ LD(C ), the intersection 
between the set of the least squares solutions and D(C) is empty and u^ 
does not exist.. In other words, the problem of determining the C-generalized 
solution may be ill-posed even when R(L) is closed. 

Sufficient conditions which assure the existence of for any g such 
that Pg 6 LD(C) are (see Appendix B): 

(i) The intersection of N(L) and N{C) contains only the null element of 
-V, i.e., the set of equations 

La = 0, Cu = 0 (3.11) 

has only the common trivial solution u = 0 (uniqueness condition); 

(ii) The operator C : X ^ Z is closed with D[C) dense in X and R(C) = Z\ 

(iii) The set of functions u such that g = Lu and Cu = 0, i.e., the set LN{C ), 
is closed in T. 

The third condition is always satisfied in the case of seminorms defined 
in terms of differential operators because in that case N{C) is a finite di¬ 
mensional subspace of X and L is a continuous operator. 

When the constraint operator C satisfies conditions (i) - (iii) and fur¬ 
thermore is bounded, exists for any g F Y and the C-generalized inverse 
Zj is bounded. These properties hold true, for instance, in the case of inter¬ 
polation problems in reproducing kernel Hilbert spaces (see Appendix C). 

3.4. Generalized solutions for problems with discrete data 

We conclude this section by noticing that problems with discrete data can 
be formulated as (3.1), g being now a n-dimensional vector in an Euclidean 
space. In fact, ignoring the errors in the data, a linear inverse problem with 
discrete data can be formulated as follows^ 12 ! : 

Given a set {F;}" =1 of linear functionals defined on X and a set 
of numbers, find a function a F X such that 

9i = - 1,.. n. 


(3.12) 
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In particular, when the functionals F t are continuous on A" by Riesz 
Theorem (see Appendix A), there exist functions v n such that 

F,{u) =(u,il'i) Y , (3.13) 

where (*,-).Y is the inner product of A”. 

For example, the problem discussed in Section 2 takes this form when 
g{x) is measured in a finite set of points only, say x x , x 2 ,..., .r, t , and A" is an 
L 2 space. In such a case we have 

il'i(x) = K(xi,y). (3.14) 

It is important to recognize also that interpolation problems take this form 
when X is a reproducing kernel Hilbert space (see Appendix C). 

This problem is a special case of the problem 3.1 if we consider the data 
gi as the components of a vector g in an N-dimensional Euclidean space Y 
and if we define an operator L from X into Y by means of the relation 

{La) t = = 1(3.15) 

The operator L is not injective: X(L) is the infinite dimensional closed 
subspace of all the functions a orthogonal to the subspace spanned by the 
functions On the other hand, the range of £, R(L), is closed: R(L) is just 
Y when the functions 0 t are linearly independent, otherwise it is a subspace 
with dimension n' < n. 

Along the lines described above one can introduce generalized solutions 
or C-generalized solutions for problems with discrete data. Their deter¬ 
mination is always a well-posed problem in the strict mathematical sense. 
However, numerical stability cannot be guaranteed (see the next section). 

As a final remark, we point out that the problem of interpolation by 
means of spline functions can be formulated as a problem of determining 
a generalized or C-generalized solution in a suitable Hilbert space (see, for 
instance, [I2j, 113]). A simple example is discussed in Appendix C. 
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4. Well-conditioned and ill-conditioned problems 


As already remarked in previous sections, continuous dependence of the so¬ 
lution on the data does not yet mean that the solution is robust against 
noise. Generalized solutions of inverse problems with discrete data can pro¬ 
vide striking evidence of this fact. Therefore, it is necessary to investigate 
more carefully error propagation from the data to the solution when solving 
problem 3.1. 


We assume, as in Section 3, that R(L) is closed, so that the generalized 
inverse L + is continuous. We denote by /\g a variation of the data g and by 
Au + the corresponding variation on the generalized solution u + . Then the 
standard analysis of error propagation proceeds as follows: 


From Equation 3.8, because of the linearity of Z + , we get Aii + = L + Ag, 
which implies 

\\^ u II x — II + Ilii^^ll V'’ (4.1) 

where ||Z + || denotes the norm of the continuous (bounded) operator L + (see 
Appendix A). Analogously, from Equation 3.1, with u = u + , it follows that 


9\\y S 


u 


+ 


Y‘ 


Combining Equations 4.1 and 4.2 we obtain the inequality 

||A« + || 


.Y 


U 


+- 


< L II L 


+ 


.Y 



Y 


9 

Y 


(4.2) 

(4.3) 


It is important to point out that this inequality is precise in a certain sense. 
When L is an N x M matrix or L corresponds to an inverse problem with 
discrete data, then equality can hold. If L is an operator on infinite dimen¬ 
sional spaces, then one can always prove that the l.h.s. of Equation 4.3 can 
be arbitrarily close to the r.h.s. 


The quantity 

a = ||Z||||L + || > 1 (4.4) 

is called the condition number of the problem. When a is not far from 1, the 
problem is said to be well-conditioned , while when a is large the problem is 
said to be ill-conditioned. 


It is obvious that these definitions are not as precise as that of well- 
posedness. However, what is important in practice is the estimation of the 
condition number since it provides insight into the numerical stability of the 
problem. In the case where L is an N x M matrix, ||£|| is the square root of 
the maximum eigenvalue of the M x M positive semi-definite and symmetric 
matrix L L (notice that the positive eigenvalues of this matrix coincide with 
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the positive eigenvalues of the matrix LL* ) and ||£+|| is the inverse of the 
square root of the minimum positive eigenvalue of the same matrix, i.e., 



More generally, if we indicate by cr+{L*L) the positive part of the spectrum 
of the operator L*L, we have 

a = 

In order to provide an example of a well-posed problem which can be 
extremely ill-conditioned, we consider the finite moment problem , i.e., the 
problem of determining a function u(x), defined for example on [0,1], given 
its moment up to the order N - 1: 

9n - / x n ~ l u(x)dx; n =1 ,...,A r . (4.7) 

J o 

If we take X = L 2 ( 0,1), then it is easy to recognize^ 12 ! that the operator 
LL* is just the Hilbert matrix 

[H N (-l)lnm = ; 7 ; 71, 777 = l,...,i\T, (4.8) 

which is a classical example of an ill-conditioned matrix. From well-known 
results it follows that the condition number for the generalized solution of 
problem 4.7 is approximately given by q ^ exp(1.75N) and therefore it 
grows exponentially with N. Already for moderate values of N, a takes 
unacceptable values. 
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5. Regularization Methods 


When the range of L, R(L) is not closed, then the inverse L~ l or the gen¬ 
eralized inverse is not defined everywhere on Y and it is not continuous. 
Therefore, both the requirements of existence and continuity do not hold 
true. This is the most difficult case and appropriate techniques are required. 
An example ot operators in this class is provided by compact operators (not 
of finite rank) as shown in Appendix A. It is easy to see that an ill-posed 
problem has a condition number a = oo. Therefore, extremely ill-conditioned 
problems behave in practice as ill-posed problems and have to be treated by 
the same technique. 


5.1. Tikhonov regularization 

The most investigated approach to ill-posed problems is the regularization 
method of Tikhonovt 14 ^. The key idea is to introduce a family of continuous 
“approximations” of a noncontinuous operator. More precisely, a regular¬ 
ization algorithm for the generalized solution of Equation (3.1) is given in 
terms of a one-parameter family of continuous operators R\, A > 0, from Y 
into X, such that for any given g 6 R(L ), 

}im R\g = L + g. (5.1) 

Therefore, when applied to noise-free data g, R x provides an approximation 
of u + which becomes better and better as A —» 0. However, when R\ is 
applied to noisy data g £ = g +- n £ and n £ represents experimental errors or 
noise, we have 

R\ge = R\g + R\n e , (5.2) 

and the second term typically is divergent when A —► 0. It follows that a 
compromise between “approximation” (the first term) and “error propaga¬ 
tion” (the second term) is required. This is the problem of the “optimal 
choice” of the regularization parameter A. 

One of the most studied regularization algorithms is obtained by mini¬ 
mizing the functional 

\\Lu - g\\ 2 Y + Aj|Ch/|| 2 z = minimum, (5-3) 

where C is a constraint operator, satisfying for instance the conditions stated 
in Section 3. In the original paper of Tikhonov, it is given by 

2 

dx , 



(5.4) 
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where the weights e r (.r) are strictly positive functions and u ir) (x) indicates 
the reorder derivative of u{. r). If a x is the solution of (5.3), and if we put 

u\ = R x g , ( 5 . 5 ) 

then 

R x = [L*L + \C*C)~ l L*. (5.6) 

Notice that u x is unique when the Equations (3.11) have only the trivial 
solution u — 0 and that when A ► 0, g £ R(L), R x g converges to L^g ! 3] . 

Three methods have been proposed for the choice of A in Equation (5.6) 
and in the case of noisy data <7,: 


(i) Among all u such that Cu z E find u that minimizes \\Lu~g £ \\ Y - l 13 ]. 
Using the method of Lagrange multipliers the solution of this problem 
can be reduced to the solution of Equation (5.3), with A arbitrary, and 
to the search of the unique A such that 


Cu 


\Wz 


E. 


(5.7) 


(ii) Among all u such that ||Z« - g £ \\ v < c, with given 5. find u that min¬ 
imizes ^ 1 ^ b Again, the solution of the problem is equivalent 

to finding the unique A such that 

II^A * 9 AW = (5.8) 

This is also called Morozov’s discrepancy principle. 


(iii) Among all u such that |jZ« 3h||v' S ||C«||^ < E , find a a of the 
type (5.5). This is equivalent 118 ^’! 19 ] to taking 

* = {e/E)\ ( 5 . 9) 


The first method consists of finding the function 11 that satisfies the 
constraint \\Cu\\ z < E and best approximates the data. The second method 
conputes the function a that is sufficiently close to the data (5 depends on the 
estimate of the errors) and is most “regular”. In the third method, one looks 
for a compromise between the degree of regularization and the closeness of 
the solution to the data. 


5.2. Regularization and filtering 

The regularized solution (5.5), (5.6) takes a very simple form in the case 
where L is compact and C I the identity operator in X. Then, using the 
singular value decomposition of L (see Appendix A) we obtain 
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«A ~= 


X>d 


a] 


+ A «* 


-(<b '*’*)v'Wjbi 


(5.10) 


and therefore the regularized solution is essentially a “filtered” version of the 
noil-regularized (generalized) solution of Equation (3.1), 


u 


+ 



(5.11) 


This remark suggests that, in this case, one can define regularization algo¬ 
rithms in terms of filter functions $*(A): 


“A = ]T$Jfe(A ) — (g,v k ) Y Uk ( 5 . 12 ) 

h a k 

satisfying the conditions: (i')$*(A) < 1; (ii')$*(A) -> 1, for any k, when 
A -v 0; (iii')T fc ^ bounded for any k and any A > 0. Such a procedure is 
often used in the inversion of compact operators as well as in the inversion 
of convolution operators^(see Section 8). 


5.3. Smoothing and interpolation 


As already remarked, regularization algorithms can be used also for ill— 
ditioned problems. A well-known example is the smoothing of a function 
whose values, specified on a finite set of points, are affected by errors^ 20 !. It is 
interesting to compare smoothing and interpolation by means of cubic splines 
using the framework outlined above. Interpolation of a function «(.r),.r <E 

[0,1], is the problem of searching for a function which takes the prescribed 
values 


u{x,)=g i ; 

and minimizes the seminorm 113 ! 

P(u)= f 

J 0 


i = 1 ,... , n 


u"{x)\ 2 dx. 


(5.13) 

(5.14) 


Therefore, the interpolation problem is equivalent to the computation of a 
generalized solution. On the other hand, the smoothing problem is formu¬ 
lated again as the minimization of the seminorm (5.14), but condition (5.13) 
is replaced by 


n 

^ M-c.) - 9i 
i= 1 


2 


2 


(5.15) 
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(for simplicity, we have assumed that the errors on the data have the same 
variance). Therefore, the smoothing problem corresponds to method (ii) for 
the choice of the regularization parameter. 


5.4. Cross validation and generalized cross validation 


We conclude this section with a short description of the cross validation 
method! 2 ^' [22] . This is a method for the choice of the regularization parameter 
and it has been applied to smoothing problems and also to the solution of 
Fredholm integral equations of the first kind in the framework of the method 
of collocation (or moment-discretization). However, it applies to any linear 
inverse problem with discrete data, as formulated in Section 3. 

The idea behind cross validation is to allow the data points themselves 
to choose the value ot the regularization parameter by requiring that a good 
value ot the parameter should predict missing data points. In this way, no a 
priori knowledge about the solution and/or the noise is required. 

Let (Lu)i be defined as in (3.15) and let be the minimizer of the 
functional 

,.i i V"" 

i 2 


n i ^ k 


9 i T A u [ | y • 


(5.16) 


Then the cross validation function F 0 (A) is defined by 


(5.17) 


fc = l 


and the cross validation method consists in determining the value of A, say 
A, which minimizes (5.17). The computation of the minimum is based on 
the relation 


Vo( A) 


il 


(La*), 


9k 


fc = l l x ~ Akk( A) 

where ii\ is the minimizer of the functional 

n 


|2 ’ 


(5.18) 


Fa 


u 


k= 1 


g i + A11 u 


2 

X ’ 


(5.19) 


and Afcjfc(A) is the kk-th entry of the n x n matrix 

.4(A) = LL*{LL* +■ A/) -1 , (5.20) 

where LL* is the Gram matrix of the functions 0 (see Equation 3.15). 

It has been shown 2 '* i that, from the point of view of minimizing predic¬ 
tive mean square error, the minimization of F 0 (A) must be replaced by the 
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minimization of the generalized cross-validation function defined by 

V(A) =(-Tr[I - A(\yr 2 (-\\(I - ^4( A) )<7!| 2 ), (5.21) 

n n 

where j| -|| denotes the Euclidean norm and Tr is the trace operation. An 
important, property of V(\) is the invariance with respect to permutations 
of the data. 
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6. Regularization of nonlinear problems 


The case of nonlinear ill-posed problems is quite difficult and, for the mo¬ 
ment, no general approach seems to exist. 

If A is a nonlinear operator from a Hilbert space X into a Hilbert space 
1, we have the equation 

9 = A(u). ( 6.1) 

Obviously, a solution of this equation exists if and only if g is in the range 
of the operator A. 

6.1. Linearization 

The simplest way of treating Equation (6.1) is to try to linearize the problem. 
This is the case of a differentiable operator^ 2 ^ . The nonlinear operator .4 has 
a first derivative at the point u 0 if there exists a linear operator L 0 : A r —► Y 
such that, for any u G -Y, 

limy [A{ u a T tu) - -4('it 0 )] = L 0 u. (6.2) 

The operator L 0 is called the first derivative of .4 at the point u Q and one 
usually writes 

L 0 = A'(« 0 ). (6.3) 

An operator which is differentiable at the point u 0 is also continuous at that 
point. 

If an approximation u a of the solution of Equation (6.1) is known and 
if the operator A is differentiable at u 0 , then Equation (6.1) can be approx¬ 
imated by the linear equation 

dg Q = L 0 da 0 , (6.4) 

where dg 0 = g — A(u 0 ), da 0 = u — u o: and L 0 is the derivative of A at u 0 . 
Obviously, the procedure is consistent if the solution du 0 of Equation (6.4) 
is a “small” correction to the approximate solution u 0 . 

The procedure can be iterated. By means of the solution du 0 of Equa¬ 
tion (6.4), one gets a new approximation, u l = u, 0 -f du Q , of the true solution 
a. Then one considers the linear equation dg\ — L^dii j, where L x = A'{u! ), 
&9 1 — 9 ~ A(f/ j), and du\ — a — u j. By solving this equation one gets a new 
approximation a2 - f du 1 and so on. It is easily recognized, by writing 
Equation (6.1) in the form P(u) = 0 with P{u) = A{u)-g, that this method 
is just an extension to functional equations of a method which, in the case 
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of real equations, is known as Newton’s method or the method of tangents. 
Such an extension is also known as the Newton-Kantorovicli method and it 
is one of the few practical methods for the actual solution of a nonlinear 
functional equation. 

The iterative algorithm can be put in the following form: 

“n + l = U n + [4'(ttn)l _1 [0 ~ 4(u n )]; (6.5) 

and a simplified algorithm is given by 

M »+1 = «n + [-4 '(m 0 )] _1 [flr - A{u n )}. (6.6) 

Sufficient conditions for the convergence of both iterative algorithms have 
been given^ 2 ^. They include the continuity of the inverse of the derivative 
of the operator A. In several inverse problems this condition is not satisfied. 
It has been suggested^ 25 ! to use, at each step of the algorithm, a regularized 
approximation of the inverse of the derivative of the operator A. Convergence 
results for such a modified algorithm are not yet available. 

6.2. Generalized and regularized solutions 

Extensions of regularization theory to ill-posed nonlinear problems have also 
been proposed: the case of nonlinear integral equations has been investigated 
by Tikhonov^ 26 ! and an abstract approach is given by Morozov^ 27 ]. 

We assume that A : X —► Y is a continuously differentiable operator, 
i.e., that A has a derivative at each point u £ X and that this derivative is a 
linear continuous operator. However, even in the case of such a simplifying 
assumption, a well-developed theory of generalized inverses does not exist. 
One can introduce least-squares solutions of Equation (6.1) by solving the 
variational problem 

11 A(a) - g |y = minimum, (6.7) 

analogous to the problem (3.3). "W hen a solution of such a problem exists 
for any g € Y, one says that Equation (6.1) is strictly normally solvable. 
A sufficient condition for strict normal solvability is that the range of A is 
weakly closed in Y 28 L Notice that this condition may be stronger than the 
condition of closure of the range which applies to the case of linear operators 
(Section 3). Weakly closed sets are (strongly) closed, but the converse is not 
always true. 

If, for a given g , the set of least squares solutions is not empty, one could 
try to select one of these solutions by means of another variational principle 
as in Section 3.1; i.e., by minimizing a norm or seminorm such as (3.10). 
In contrast to the case where the operator A is linear, the generalized or 
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C-generalized solution defined in such a way may not exist and, even if it 
does exist, is not necessarily unique. Such a lack of uniqueness applies also 
to the case of regularized solutions (in which case, however, existence can be 
easily assured). 

The basic point in the definition of regidarized solutions is again the 
minimization of a functional similar to (5.3); i.e., 

= ll^(n) - g\\ 2 v + A||Ct/|||. (6.8) 

The uniqueness of the minimum of $aM usually is not proven (but see [26) 
for a special case where uniqueness holds true). However, it is not difficult 
to prove the existence of at least one local minimum. Here we give the proof 
under conditions which are satisfied in the case of the problem of shape from 
shading (Section 11). 

Assume that the operator A : X —> Y is continuous everywhere and 
that the constraint operator C : X —> Z is linear and has a compact inverse 
C~ l . (This condition is satisfied, for instance, by the differential operator 
(5.4)). Then, for any A > 0, the functional (6.8) has at least one minimum 
point u\. The proof goes as follows: 

Let M\ be the lower bound of $\[u] and let {u n } be a minimizing 
sequence such that 

M\ < #A[«n] < M\ + 1/n. (6.9) 

It follows that: 

ii x-t || /1 •, r i \ 1/2 / A/a + lxi/2 

\\Cu n \\z < (-^aM) < (— -) ; (6 .10) 

and therefore, the sequence {v n = Cu n } is bounded. Since C~ l is compact, 
we can extract from {u ri = C 1 e n } a subsequence strongly convergent and 
such that the corresponding subsequence of the v n is weakly convergent. 
Without loss of generality, we can assume that these conditions are satisfied 
by the sequence {u n } itself. Then, let u x be the strong limit of {u n } and v\ 
be the weak limit of v n ; it follows that u\ = C~ 1 v\. 

Since Cu n weakly converges to Cii\ , from the lower weak semicontinuity 
of the norm we have 

\\Cu\\\z < liminf ||Ctt n || z . (6.11) 

On the other hand, from the strong convergence of u n to u x and from the 
continuity of A(u) we have 


|!.4(«a) - g\\y = lim ||A(w n ) - g\\ Y , 
and, by combining Equations (6.9), (6.11), and (6.12), we get 


( 6 . 12 ) 
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M\ < $[u a{ < liminf $ A [«»] = lim$ A [« n | = M x . ( 6 . 13 ) 

It follows that 


$!**a1 = A/a, (6.14) 

and the existence of the minimum point is proven. 

As stated above, in general nothing can be said about the uniqueness of 
the minimum of the functional ( 6 . 8 ). However, if we assume that: 

(a) for a given g , Equation ( 6 . 1 ) has a unique solution u in the domain of 

C; 

(b) in a neighborhood ol u , the operator .4 has everywhere continuous first 
and second derivative; 

(c) the derivative of A at a, A'(u). is invertible; 

then, by a rather easy generalization of the theorems contained in [26], one 
can prove that if g s is noisy data, with \\g - g € \\ Y ^ and if in the func¬ 
tional ( 6 . 8 ), with g replaced by g t , we choose the regularization parameter 
A in such a way that A = where 7 is an arbitrary constant, then any 
minimum point of such a functional converges to u when 5 f—> 0 ; therefore, 
for sufficiently small values of t, there exists only one minimum point. 
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7. Stochastic route to regularization 


When a priori knowledge of statistical properties of the signal and of the noise 
is available, a probabilistic version of regularization methods is possible^ 29 '’! 30 !’ 

[31] 


Here we consider a Bayesian approach that has the advantage of showing 
the connection between Markov Random Field models and standard regu¬ 
larization as developed in this paper. In particular, we will be able to see in 
which sense standard regularization is a special case of MRF models and is 
itself equivalent to Wiener filtering. 

The first step is to write Equation (3.1) in this form, 

g = Lu + w, (7.1) 

where w is a function representing the effect of the noise on the data. Notice 
that in this representation no assumption is implicit about the structure of 
the noise (additive noise, signal dependent noise, etc.). 

The second step is to assume that there exist stochastic processes u, < 7 , tr, 
related by 

g_ ~ Lu + w, ( 7 . 2 ) 

and that the functions g , u, w appearing in Equation (7.1) are values of the 
processes a, g , w , associated with a specific outcome of a given experiment 
(we use here the nomenclature introduced in [32]). 

For simplicity, we also assume that the processes u , </, w_ have zero mean. 
This assumption is in fact not restrictive because, if it is not true, it is always 
possible to introduce processes satisfying this condition just by subtracting 
the means from the original ones. Thanks to the linearity of I, relation (7.2) 
holds true also for the new processes. 

When the mean is zero, the autocorrelation and the autocovariance of 
a process u coincide. If the values of the zero mean process u are functions 
of a variable ,r (eventually multi-dimensional), the autocovariance function 
of u is 

Cyr,.^) = E{u(z)u(z')}, ( 7 . 3 ) 

where E indicates the expected value. As in the previous sections, we assume 
that the functions u , the values of the process a, belong to a Hilbert space A" 
(for instance, a space of square integrable functions) and that the functions 

values of the processes g , ic respectively, belong to the same (possibly 
different from X) Hilbert space Y. The appropriate description of stochastic 
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processes with values in Hilbert spaces is given in terms of weak random 
variables or cylinder set measures^ 33 -. 

Then, the autocovariance function of the process u can be considered as 
the kernel of an operator defined on the space X : 

(Ru<p){x) = j Cu(x,x')(f>(x')dx.\ 4> G -Y. (7.4) 

The operator is called the covariance operator (or the covariance) for 
the process u. It can also be defined for weak random variables with values 
in an abstract Hilbert space A" ^ 33 ^ . 

Coming back now to our basic equations ( 7.1), (7.2), the inverse problem 
consists in estimating a value of u , given an observed value g of g and given 
a priori probabilistic knowledge on the processes u and w. 

We take a Bayesian approach and write the a posteriori probability 
density as 

P{u/g) = constP(u)P(g / u) (7.5) 

where P(u) is the a priori probability density of process u and P(g/u) is the 
conditional probability density of the data g given u. 

We consider now the special case of u being a gaussian process (or 
equivalently the linear transformation - such as a derivative - of a gaussian 
process). In this case, the a priori probability distribution of u is 

P(u) = const •exp[-i(u,fl~ 1 a) ]. (7.6) 

Let us assume that the noise process w is additive, white and gaussian 
with variance a 2 . Then the a priori probability P(g/a) can be written as 

P{g/u) = const • exp [- — \\g - Lu\\ 2 y }. (7.7) 

l(j L 

Depending on the optimality criterion there are now several ways of 
obtaining the best estimate of u given the data g. A commonly used estimate 
is the Maximum A Posteriori (MAP) estimate 

P ^ a bestJs) = nmx{P{u/g)\u 6 A}. (7.8) 

From Equations (7.5) (7.7) we have 

P(u/g) = 

const .etp[—d—( || La -ally- + <r 2 (u, R~ *«).y)]- (7.9) 

CT 


If we put 


Ru = (c*cr\ 


(7.10) 
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then from Equations (7.8) and (7.9) we have 

?/ best) = niin{A/(-u)l« € A"}, (7.11) 

where 

M{u) = ||Lm - g\\y 4 - cr 2 ||C«!|^. (7.12) 

It follows that — F 0 g where F 0 is given by (5.6) with A = a 2 . If 

we put Rw = <t 2 J, where I is the identity operator in Y (R w is the covariance 
operator of white noise), then F 0 can also be written in the following form: 

F 0 ~ Ru_L (LRuL* 4 - RuJ , (7.13) 

with Ru given by Equation (7.10). 

The operator F 0 is sometimes called Wiener filter (or Wiener-Kolmo- 
goroff filter) and is quite similar to the operator (5.6)^ 31 ^. In other words, the 
regularizing operator (5.6) is equivalent to a Wiener filter in the case of white 
noise , provided that the constraint operator C is related to the covariance 
operator by the relation (7.10). 

Most of our previous assumptions can be relaxed in a more general prob¬ 
abilistic scheme based on the formalism of Markov Random Fields (MRF) 
defined on finite lattices. In particular, the noise may not be additive, the 
operator L may not be linear and P(u ) does not need to be gaussian. 

MRF models have been formulated for several problems in early vi¬ 
sion. Under simplifying assumptions they reduce to the discrete equiva¬ 
lent of standard regularization. Though they are computationally expensive 
they may represent a powerful extension of the methods described in this 
paper^ 34 ^ . 
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Part Two 


The initial stage of machine vision, now called early vision, consists of 
distinct but interrelated problems like “edge detection,” “computation of 
optical flow, ’ “structure from motion,” “structure from stereo matching,” 
etc. From a theoretical point of view, these problems can be considered as 
independent, at least to a first approximation. The integration of various 
outputs is performed at a higher stage, where geometrical reasoning and 
much a priori information will be used. These different modules may reflect 
processing occurring in our brain, where simult aneously we compute different 
information from images: we can extract rapid changes in image brightness 
(edge detection); we can recover the shape of an object from its shading 
(shape from shading); we can understand the motion of objects from the 
changing images (computation of optical flow); we recover the 3D structure 
of a scene from a pair of images (structure from stereo); and we are able to 
have a dense description of 3D surfaces from sparse features (visual surface 
interpolation). 

Several of these problems have been recently solved with variational 
techniques, in particular by Horn, Grimson and Hildreth. We will show that 
many of these results and several new ones - in particular existence and 
uniqueness of solutions - are direct consequences of mathematical results 
presented in Part One. 


Part Two is divided in four sections, each dealing with a main topic of 
early vision. Section 8 presents the ill-posed nature of numerical differenti¬ 


ation. In Section 9 we show how recently obtained mathematical results on 
optical flow [ 37 ]’[ 38 l’[ 3 h'i l5 are straightforward consequences of regularization 
theory. Section 10 discusses a recent approach to surface interpolation, il¬ 


lustrating how variational principles 


[ 39 ],[ 40 ],[ 41 ],[ 42 ],[ 43 ],[ 44 ] 


can be viewed as 


regularized solutions to discrete ill-posed problems. Section 11 reviews recent 


variational approaches to shape from shading, in the framework of regular¬ 
ization theory. 
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8. Edge detection and numerical differentiation 


Recently standard regularization techniques have been applied to a classical 
problem of early vision - edge detection. Edge detection, intended as the 
process that attempt s to detect and localize changes of intensity in the image 
(this definition does not encompass all the meanings of edge detection) is a 
problem of numerical differentiation^ 46 '. As we will see in the next section, 
differentiation is a common operation in early vision and is not restricted to 
edge detection. Differentiation is ill-posed because the solution does not de¬ 
pend continuously on the data. The intuitive reason for the ill-posed nature 
can be seen by considering a function f{x) perturbed by a very small (in 
L 2 norm) “noise” term eshiQx. f{x) and f{x) + esinttx can be arbitrarily 
close for very small e, but their derivatives may be very different if SI is large 
enough. This simply means that differentiation “amplifies” high-frequency 
noise. Differentiation can also be seen as the recovery of the solution u to 
the inverse problem g = Lu where 


L : X 


Y 


(Lu)(x) = / u{y)dy 

V — OO 


( 8 . 1 ) 


Thus u. is the derivative of the data g. The operator (8.1) is not bounded in 
L 2 ( — oo, oo), and the range of L is not closed. Therefore, the inverse problem 
is ill-posed . 


8.1. Regularization of differentiation 

As shown in Section 3, it is possible to restore well-posedness by redefining 
the solution space AT and the data space Y. Let us redefine the solution 
space X as the subset X' of square integrable functions f(x) in (-oo, +oo) 
such that 


. 

(1 + — )|f(u0| 2 <iu; (8.2) 

-oo ^ 

exists, where F(u>) is the Fourier transform of f(x). The new data space 
Y' is simply the range of L. It is easy now to see that the inverse problem 
L : A"' —> Y' when the operator L is defined as in Equation 8.1 is well-posed. 

Differentiation can be transformed into a well-posed problem by using 
Tikhonov’s regularizing operators. For equations of the convolution type, 
such as (2.1) with h(x,y) = R(x —y), the regularizing operators correspond 
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to convolving g{x) with a filter /(.r, A) (where A > 0 is the regularization pa¬ 
rameter) whose Fourier transform F(u>. A) satisfies the following conditions: 

(i) 0 < F{u>, A) < 1 for A > 0 and all 

(ii) F(u> , A) is an even function with respect to and 6 L 2 ( -oo, foe); 

(iii) F(<jj,\)jaj belongs to L 2 {— oo,-foo) for any A > 0; 

(iv) lim A _o TV, A) = 1. 

This regularizing filter is equivalent to a smooth low pass filter. Three 
main types of filtering have been used in edge detection. We will list their 
main properties below. 

8.2. Band-limited filters 


Band-limited filters are an obvious choice for regularizing differentiation, 
since the simplest veay of avoiding harmful noise is to filter out high frequen¬ 
cies that are amplified by differentiation. Linear and circular prolate func¬ 
tions constitute an interesting class of band-limited filters^ 47 ^ 48 ! and have 
already been used in edge detectionf 49 ^ . These filters satisfy all conditions 
of Tikhonov needed to regularize differentiation, if we take the inverse of the 
hand-width as the regularization parameter. 

8.3. Support-limited filters 


All real filters have a finite extension and are support-limited. A class of 
support-limited filters that has been used in edge detection^ 50 ! is the so- 
called difference of boxes (DOB). These filters are Haar functions which 
form a basis for square integrable functions on a bounded interval. However, 
these filters do not satisfy Condition (iii) above, and therefore cannot be 
used to regularize differentiation. This conclusion derives from the fact that 
the Haar functions are discontinuous. As a consequence, the limit of their 
Fourier transform for u; going to infinity tends to zero as a? -1 . 


It is possible, however, to introduce smooth support-limited filters whose 
Fourier transform tends to 0 as desired asai-^oo. If the function /(.r, A) has, 

for instance, continuous derivative up to order p and the (p-fl )^ 1 derivatr 


rive is 


integrable, then F( u>,A) tends to zero as |u;| Furthermore, if / (.r, A) 
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is C°°, then F (u>, A) tends to zero more rapidly than any inverse power of 
An example is provided by the function 


/(.c. A) 



x\ < A 
«r I > A 


(8.3) 


where C\ is a constant such that F(0,A) = 1. Therefore, the best support- 
limited filter for edge detection and numerical differentiation is not the DOB 
but the filter (8.3), which is often used in digital signal processing when 
aliasing needs to be reduced. 


8.4. Filters with minimal uncertainty 

The Gaussian function minimizes the product of spread in the space and 
in the frequency domain^ 32 ' and can be viewed as a filter with minimal un¬ 
certainty. Filtering with a Gaussian function regularizes differentiation, be¬ 
cause the Gaussian function /(.r, A) = exp( -.c 2 /A 2 ) satisfies all conditions of 
Tikhonov. Moreover, filtering with a Gaussian transforms a continuous and 
bounded function into an entire function. 


8.5. Interpolation and approximation 

Numerical differentiation can also be regularized in a different way. It is 
possible to interpolate or approximate the data with an analytic function 
and subsequently compute the analytical derivative of the interpolating or 
approximating function. 

For instance in ID the “image” model is y t = /(*,•) + e t , where y t is the 
data and e t represent errors in the measurements. We want to estimate f 
from an interpolating or approximating function /. We can choose a regu¬ 
larizing functional ||C/|| 2 = J (f"(x)) 2 dx , where /" is the second derivative 
of f . This choice corresponds to a constraint of smoothness on the intensity 
profile. Its physical justification is that the (noiseless) image is indeed very 
smooth because of the imaging process: the image is a bandlimited func¬ 
tion and therefore has bounded derivatives. We can decide that the data 
are noiseless and therefore we want to interpolate the data. This procedure 
is equivalent to interpolating the data with cubic splines and differentiation 
can be safely obtained by the analytical differentiation of the interpolating 
spline. If the data are noisy and we want to take errors in the measurements 
into account, we can look for an approximating function minimizing 

-/(*i )) 2 + A J (f"(x)) 2 d*. 


(8.4) 
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Ill [52] it was shown (a) that the solution /(.r) can he obtained by 
convolving the data y t (assumed on a regular grid and satisfying appropriate 
boundary conditions) with a convolution filter R , and (b) that the filter R 
is a cubic spline with a shape very close to a Gaussian and a size controlled 
by the regularization parameter A. Differentiation can then be accomplished 
by convolution of the data with the appropriate derivative of this filter. The 
optimal value of A can be determined for instance by cross validatioiJ 21 ^ 22 ] 
and other techniques. This corresponds to finding the optimal scale of the 
filter^ 46 ]. 

These results can be directly extended to two dimensions. The resulting 
filters, which are spline filters for discrete data and Butterworth-like filters 
for continuous data (they are eventually indistinguishable in practice) are 
very similar to the derivatives-of-a-gaussian extensively used in recent years 
[ 46 ],[53],[54],[55] . jf re g u l ar j z i n g functional \\Cf\\ is 

J J (V 2 grad f)\1x dy, (8.5) 

where V 2 indicates the Laplacian and grad/(.r, y) the gradient of f(x,y), it 
has been showiJ 5 ' that the solution f(x.y) can be obtained by convolving 
the data g{x,y) with the filter 


n , v 1 J 0 (u/C) , 

R2(x ' y) = 2h wrrr^’ < 8 - 6 > 

where Jq is the zero order Bessel function and ~ = y 7 x 2 f i/ 2 . If the 
regularizating functional is 

11 (8.7) 

the filter becomes 

if 

Therefore numerical differentiation can be regularized in a number of 
ways which are all consequences of the results presented in Part One. There 
are two main possibilities: filtering the data with appropriate derivatives of 
Tikhonov filters; or interpolating (or approximating) the discrete data with 
splines and then performing an analytical derivation. These two regularizing 
procedures are equivalent. 
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9. Computation of optical flow 


A major aim of early vision is the segmentation of the visible scene in regions 
corresponding to distinct rigid objects. Motion is an important source of 
information for this goal. The imaging device projects the 3D velocity field 
of viewed objects into the image as a 2D field. When a moving scene is 
viewed by a camera it is possible to recover directly the optical flow. The 
optical flow 139 is commonly defined as the distribution of apparent velocities 
of movement of brightness patterns in an image. Optical flow and the 2D 
motion field are related and their relationship has been carefully analysed in 

[45] • 


In this section we discuss two different approaches to the computation 
of optical flow. Horn and Schimck^ 39 ' derived equations relating the change 
in image brightness E(x,y,i) at a point {x,y} and time t to the motion 
of brightness pattern. Their key assumption is that the brightness of a 
particular point in the moving pattern is constant, so that the total derivative 
of E{x, y,t) is zero: 

dE, 

— (.c, y,/) = 0. ( 9 . 1 ) 

Then, from local measurements of the partial derivatives of E(x, y, t) with re¬ 
spect to space coordinates and time, it is possible to estimate the component 
of the velocity field parallel to the gradient of E{x,yJ). The normal com¬ 
ponent is not determined and it must be recovered (see [45] for an analysis 
of the validity of the underlying assumptions). 


Hildreth 1371, t 38 ^ suggested computing the optical flow not over the entire 
image but only along 1-D contours. In real images, these 1-D contours are 
edges corresponding to sharp changes in image brightness (see Section 8). 
Hildreth^ 37 !’ 138 ' observed that it was possible to obtain the normal vectors 
along the contour by a simple inspection of the extracted edges: if E(x,y , t) 
is again the image brightness, then the normal component v 1 - of the local 
velocity vector V at the points of the contour T is given by 


v 


L 


0 rtn 
E 


r 


(9.2) 


where V 2 is the Laplacian. A better estimate of e 1 , however, is 

L _ d d 2 E 
dt On 2 r’ 

where d 2 / dn 2 is the second derivative along the direction of the 


(9.3) 
gradienti 46 ^. 


In this section we will discuss the ill-posed nature of the recovery of 
optical flow as proposed by these authors. 
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9.1. Optical flow along a contour 


We first consider tlie problem of determining the two-dimensional optical flow 
along a contour T in the image assuming that local motion measurements 
along the contour provide only the component of the velocity in the direction 
perpendicular to the contour. The component of velocity tangential to the 
contour is invisible to local detectors that examine a restricted region of the 
contour. The local velocity vector T'(s) is decomposed into a perpendicular 
and a tangential component to the curve 

F(s) = v T (s)t + r i (5)n. (9.4) 

Here s is the arclength and /, n are unit vectors respectively tangent and 
normal to the contour T 

r* ( cos 0 \ ( — sill 6 \ 

‘Wsm*) < 9 ' 5 > 

where 8 is the angle between t and the unit vector of the r-axis. They depend 
also on s but we omit this dependence for simplicity of notation. 

The component and the vectors t,n are given by direct measure¬ 

ments and therefore are the data of the problem. We will denote by g(s) the 
measured values of ^(s) and by g(s) the corresponding velocity field 

g(s) =g{s)n. (9.6) 

Then the problem can be formulated as the inversion of a projection operator 
in the space X = Y = £ 2 (T) © I 2 (T) (X 2 (T) denotes the space of square 
integrable functions defined over T). The norm of a velocity field V G A r is 
defined by 

!I V II x = J s) -V{s)ds = 


L 


e T (s) "ds + / \v ± (s)\*ds. 


I 


.-Li 


(9-7) 


The projection operator is 

LV(s) = v L {s)n , (9.8) 

and the set of the solutions of the equation 

LV = g, ( 9 . 9 ) 

with g defined by Equation (9.6), is the set of the velocity fields V given by 

F(s) = 4is)t + g{s)ri, (9.10) 

where g(s) is the given data function and v’(s) is an arbitrary function in 
X 2 (T). The generalized solution, or solution of minimal norm, exists for any 
data function g(s ), but it is trivial since it is given by 

V + = a 


(9.11) 
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In other words, the generalized solution restores well-posedness, but it gives 
a solution which does not have any physical relevance. Therefore, one has to 
look for suitable C-generalized solutions corresponding to physically accept¬ 
able velocity fields. 

9.2. A C-generalized solution for the optical flow along a 
contour 


A seniinorin introduced by Hildreth gives a very useful constraint for the 
recovery of the optical flow. Put Z = A' = L 2 {T) ® L 2 (T) and introduce the 
operator 

cv = V (9.12) 

where the dot means derivation with respect to s. Then the C-generalized 
solution is the velocity field of the form (9.10) which minimizes the functional 

WCVftx 77 J r V • (9.13) 

It is easy to show that existence and uniqueness of the C-generalized 
solution can be derived from the general result given in Section 3.3. 

First, consider the question of uniqueness. We know that the C-gener- 
alized solution is unique if and only if the intersection of N(C) and A r ( L) is 
the null element (Condition (i) of Section 3.3). Now N(C) is the set of the 
constant velocity fields (or translations), say V = a. Furthermore, N{L) is 
the set of the velocity fields orthogonal everywhere to ??, i.e., v- n = 0. This 
condition can be satisfied by a £ 0 only if n is constant; that is, only if T is 
a straight line. Therefore if T is not a straight line, the intersection of A r (C) 
and iV (L) is always the null element, and uniqueness is restored by the use 
of the C-generalized solution (9.13). 

The existence of the solution follows from the fact that the operator 
(9.12) satisfies Conditions (ii) and (iii) of Section 3.3. Condition (ii) is a 
rather general property of differential operators (see Appendix B for com¬ 
ments), and Condition (iii) is also verified because N{C) is a two-dimensional 
subspace of X = L 2 (T) tP I 2 (T). Therefore, we can conclude that the C- 
generalized solution exists whenever g e LD(C). 

In order to see more precisely the meaning of the last condition, assume 
that the contour T consists of a finite number of regular arcs, so that the 
tangent is continuous on T with the exception of a finite number of points, 
si, s 2 ,.. ., s n , where the tangent has both right and left limit. Then a solution 
V(s) of the form (9.10) is in the domain of the constraint operator C if 
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4'{s)t and g{s)n are differentiable on each regular arc and furthermore they 
satisfy suitable conditions at the discontinuity points s; in order to assure 
the continuity of V(s). We can derive these conditions from the equations 

—4 —^ 

^ 4- ( s i ) ^ ^ i)» ^ (9.14) 

where -f and - denote respectively right and left limit. It follows that 

V’+(«i) = (sin 4>i)~ l [g + {s t ) cos (/>, - g-is, ) 

= (sin 4>i)~ l [flf+(si) - g-(si)c os&], (9.15) 

where sin <p = /_ • = —t + • r7_,cos^> = n+ • n_ — f_|_ • t_ (f + is the right 

limit of the tangent, etc.). Therefore, if g(s) admits a right and left limit at 
the points s;, it is possible to derive from Equation (9.15) the right and left 
limit of V’. All these conditions characterize the subset D{C ) which contains 
the unique solution which minimizes the seminorm (9.13). Of course, if g is 
not differentiable on the regular arcs or does not have left and right limits at 
the discontinuity points, the C-generalized solution does not exist. It follows 
that the problem is ill-posed. 

Before discussing this point, we want to point out that, if the data 
g are not affected by noise, the C-generalized solution coincides with the 
true solution in two important cases [37 ^ 38 J: the first is a translation of an 
arbitrary contour and the second is an arbitrary motion of a rigid polygon. 
These results can be derived from the Euler equation for the C-generalized 
solution. 

Assume that the regular arcs have a differentiable curvature. From the 
following relations, which are true on each regular arc 

t = 0ii , n = —Ot (9.16) 

where 0 is just the curvature, one can derive from Equation (9.10) 

V{s) = [0(s) -0(s)g{s)}t + [£(s) 4 0(s)V’(s)]n (9.17) 

and therefore, when 4' satisfies the conditions (9.15) 

\\cv\\ x = (l^( 5 )l 2 + \0{s)g{s)\ 2 }ds+ 

+ I^Wis) I + |0(s)t/’(.s)| 4- 20(j>)#(s)V>(s) - 20{s)g(s)4'{s)}ds (9.18) 

This is a functional of which is an arbitrary function except for being 
differentiable and satisfying conditions (9.15). Then, by annihilating the first 
variation of this functional, it follows that, on each regular arc, the function 
4' which minimizes the functional is solution of the differential equation 

-ii s ) + |0(s)| V(s) 4 - 2 0{s)g(s) + 0(s)g(s) = 0. (9.19) 
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In the case of a closed contour, the C-generalized solution is given by 
the unique solution of Equation (9.19) satisfying the conditions (9.15). If 

the contour is regular everywhere, then one has to add boundary conditions 
such as 

0(0) = V’(7) , V’(0) = 0(/). (9.20) 

When the contour is open, one needs boundary conditions at the end points 
of the contour. These can be obtained directly, through a partial integration, 
from the annihilation of the first, variation of (9.18) 

0(0) = 0(O)tf(O) , 0(/) = 0{l)g{l). (9.21) 

However, these conditions are correct only in the case of a pure translation. 
In the general case it is necessary to measure the tangential velocity of the 
end points and take 

0(0) = e T (0) , 0(0 = e T (0, (9.22) 

where r r (0) and r r (/) are the measured values. 

If the motion of the contour is a pure translation and a = {a 1? a 2 } is the 
constant velocity field, the noise-free data are given by 

g{s) = -oj sin 0 -f a 2 cos 0. (9.24) 

Then, if we put 

0(s) = cos# + a 2 sin#, (9.25) 

taking into account that 0 = 9g, g = -00, it is easy to verify that 0 
satisfies Equation (9.19). In the case of an open contour, also the boundary 
conditions (9.21) are satisfied (the boundary conditions (9.15) are obvious 
since the velocity field is continuous). 

In the case of a rigid polygon, since an arbitrary rigid motion consists 
of a translation plus a rotation, on each segment of the polygon both the 
normal and tangent velocity are linear functions of the arclengtli s. But, on 
a segment of a straight Hue, Equation (9.19) becomes 0(s) = 0 and therefore 
4'{s) is a linear function of s. The boundary conditions (9.20) (plus the 
boundary condition (9.22) in the case of an open polygon) give the correct 
values of the constants provided that also in this case the measured values 
are noise-free. 

As we already remarked, the difficulty of this approach is that the prob¬ 
lem of determining such a C-generalized solution is ill-posed. For this rea¬ 
son, in the case of noisy data, one has to look for a regularized approxima¬ 
tion of the C-generalized solution, which can be obtained by minimizing the 
functionalt 37 ^ 38 ] 

*\{V\ = II LV - g\\ 2 x + X\\CV\\l 


(9.26) 
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If we denote by V\ the minimum of the functional (9.26) and if we put, 

V\ = 4- 4>\(s)ri (9.27) 

then it is easy to show' that, on each regular arc, 0 A and (p\ must be solutions 
of the system of differential equations 


-0a(- s ) + 2 ^(s)0 a (s) -f \0(s)\~il' X (s) + 0{s)4>x{s) = 0 (9.28a) 

-A[0 a (s) 4- 20(s)0 a (,s) + |^(5)| 2 0 a (s) + 6(s)i{'x(s)} + 0 A (s) = g(s ) (9.286) 
plus boundary conditions similar to those discussed in the previous case 
(continuity of \\ at the discontinuity points, etc). The determination of 

the parameter A can be performed using one of the methods discussed in 
Section 5. 


In practice, the most economical method for the computation of V x 
is perhaps the conjugate gradient method. Regularizing properties of this 
method 156 '’ f5,] can also be used in order to avoid the minimization of (9.26). 

In the previous treatment we have neglected the errors in the determi¬ 
nation of the contour which imply an approximate knowledge of the operator 
^ (Equation (9.8)). However, if the equation L\ ~ g -f- dg, where dg is the 
erior on the data, is replaced by the equation (L -f- dL)V — 9~\~dg, where dL 
is the error on the operator, it appears that the two equations are equivalent 
in the sense that only the error on g is different in the two cases (in one case 
it is dg and in the other case it is dg - (dL)V). This point of view assumes 
that the errors in the determination of the contour have been included in the 
errors on the data. 


9.3. Two-dimensional optical flow 

As we already recalled at the beginning of this section, Horn and Schunckf 39 ^ 
attempted to recover the optical flow in the entire image and not just on 
a one-dimensional contour. Their basic equation is Equation (9.1), which, 
written explicitly, provides the relationship 

VE • V = - d t E (9.29) 

where V E — {d x E, d y E} is the gradient of the brightness distribution in 
the image, V is the velocity field (optical flow), and d t E is the partial time 
derivative of the brightness. Therefore, a measurement of VE and d t E gives 
the component of V parallel to VE. 

We assume that the brightness distribution E(x,y,t) is defined in a 
bounded region Q whose boundary dtt is a contour with an everywhere con¬ 
tinuous tangent. Furthermore, we will also assume, for simplicity, that V E 
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is never zero in ft and that the level lines of E(z, y,t) have everywhere differ¬ 
entiable tangent and normal. We denote by t and n the tangent and normal 
to the level line at the point {.r,«/} 


t = VE 


d y E 

-d r E 


n = |V£j 


d x E 

dyE 


(9.30) 


Then the velocity field V(x,y) can be everywhere represented as follows 


F(.r,f/) = t ,T (.r, y)f v^(x,y)il. 


■L, 


(9.31) 


The problem can again be formulated as the inversion of a projection oper¬ 
ator: taking X — Y = L 2 (Q) 0 L' 2 (ft) and 


{LV){x, y) = v L (x,-y)n. 


(9.32) 


The data will be given by 


9(z,y) = 9(x,y)n, (9.33) 

where g(x,y) is the measured value of -d t E/\XE\. Then the set of solutions 
of the equation LV — g is the set of velocity fields 

V(x,y) = 4’(x,y)t+g(x,y)n (9.34) 

where ip is an arbitary function in L' 2 {Q). The generaHzed solution V’ f is 
trivial also in this case, since V + = g. 


9.4. A C-generalized solution for the two-dimensional optical 
flow 


As in the case of the optical flow along a contour, it is necessary to look 
for C-generalized solutions. The method suggested in [39]Horn and Schunck 
(1981) can be formulated in this framework. 

Introduce the constraint space Z = X 0 X and define an operator 
C : X h-* Z as 

" -(££)• (»•*> 

with the associated seminorm 

\\CV\\ 2 Z = / {d x v ■ d x V + d y V ■ dyV}. (9.36) 

J n 

Written in terms of the cartesian components of V this is just the integral 
of the quantity called the measure of the departure from smoothness in the 
velocity flow^ 39 ^. 

First consider the question of uniqueness. The null space N(C) is the 
set of the constant velocity fields, say V = a, while the null space N(L) is the 
set of the velocity fields which are orthogonal everywhere to n, i.e., V-n = 0. 
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The intersection is the set of constant velocity fields such that a ■ n = 0 and 

this condition cannot be satisfied by a ^ 0 if the level lines are not parallel 
straight lines everywhere. 

It is easy to verify that conditions (i) - (iii) of Section 3.2. are satis¬ 
fied and the existence of the solution is guaranteed. It may be interesting 
however to write the Euler equation for the C-generalized solution. After 
some lengthy but elementary computations, using the orthogonality rela¬ 
tions n • d x n = n • d y n = 0, t • dj = t • d y t = 0, d x n • d x i = d y n • d y f = 0 we 
obtain 



|V V ' 


\ CV Wz ~ j 4- \d y n\ 2 ^j \g \ 2 1 dxdy- f 

+ + \d y t} 2 ) |V’i 2 + 2 (n • d £ t) d x g + (n • d y t) d y g j/»-f 


+2 g [(f • d x f) d x v + (t • d y n) d y 4 ’ 


dxdy . 


(9.37) 


In order to find the Euler equation of the functional (9.37) one has to consider 
a variation of 0, V’ > V’ + h and put equal to zero the term of first order 
in h. Then, using the divergence theorem in order to eliminate the partial 
derivatives of h, transforming the fourth term in Equation (9.37) by means 
of the identities t- d x n = -n • d x t, t • d y n = -n ■ d y t, and using the fact 
that h is arbitrary, one finds that the unique function 4 ’ which minimizes the 
functional (9.37) is the unique solution of the boundary value problem: 


vv’ + (|<M 2 + M 2 W’+ 

+2[(j7 • d x t) d x g -T (n • d y i) d y g ] -f (n • A/) g ~ 0 


dil' 


= n 


an 


dt 


an 


(9.38) 

(9.39) 


~ ■ ' VS <1 ft 

where v is the normal to dft. Notice that this boundary value problem is 
just the extension in the 2-D case of the problem (9.19) with the boundary 
conditions (9.21). The boundary condition (9.39) can be replaced by the 
value of 4 1 if the tangent velocity can be measured on dVl. 


It is also easy in the present case to verify that if the motion is a pure 
translation (i.e., a constant velocity field), and if the data function is noise- 
free, then the O-generalized solution coincides with the exact velocity field 

It is also obvious that in such a case the C-generalized solution is ill- 
posed and that one must introduce regularized approximations. These can 
be obtained by minimizing the analogue of the functional (9.26), and this is 
precisely the method used in [39]. 
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10. Surface reconstruction 


Most algorithms able to recover depth from pairs of stereo images provide 
sparse depth \ allies, that is, depth is obtained only tor special points in 
the \iewed scene. Since a global description ot the 3D structure of the 
viewed scene is desirable, it is useful to consider the problem of recovering 
a mathematical representation of a visible surface f(x,y) from the sparse 
data^ 41 !. 

10.1. Surface interpolation 

The original data are a finite set of depth values z, = i - 1_, n 

(which are assumed to be exact; that is, noise-free) and the problem is the 
recovery of a smooth function / (.r, y ) interpolating z t at (.r,, y t ) = f, con¬ 
tained in fi. Grimson ^ j proposed to find f such that it minimizes the 
seminorm 



Uniqueness of solution is guaranteed by the existence of at least four non- 
coplanar points Z{ = /(.r,, y t ) f 40 M 41 l. 

This procedure can be seen as an application of generalized inverses in 
the case of discrete data (see Section 3.4): in this case, uniqueness of the 
solution is guaranteed when the intersection of the null space of C, (iV(C7)) 

and the null space of L{X{L)) is empty, where L is the operator defined in 
Section 3.4. 

The null space N{C) is composed of the set of functions f(x,y) = ax -f 
by + c with a , 6, c constants. These functions consist of all planar surfaces 
defined in Q. The null space J\(L) lias been defined in Section 3.4 and 
consists of the set of functions such that f{x n y,) = 0 for i = l ? . n> 
Therefore, it is easy to see that when i 4 and the points U = (.r,, y t ) 
are distinct, the intersection of N(C) and N(L) is empty. In other words, 

uniqueness is guaranteed if there are at least four non-coplanar points, as 
required in [40], [41]. 

10.2. Surface approximation with noisy data 

It is also useful to consider the case in which the data are noisy; that is, when 
the original data are gi = f(t t ) 4- e ir i = 1,... ,1V and s t is additive noise. 
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In this case, it is reasonable to look lor a solution close to the original data 
bllt 31:50 smoothf 39 ]’f 40 ]’f 41 ht 42 hf 43 l’f 44 l. This approach can be seen as an 
application of regularization theory. In Part ne we have already shown that 
interpolation is an ill-posed problem which can be solved by the use of a gen¬ 
eralized inverse. We will present now an approach to interpolation directly 
originating from regularization theoryt 58 ^ 59 ], which clarifies the relationship 
between splines, regularization theory, and gives a different framework to the 
results on visual interpolationf 40 l’f 41 h[ 42 ]’[ 43 b[ 44 ]i[ 4 5] > 


We can consider the case in which we want to estimate a smooth function 
t Q C i? 2 , given a finite number of observations of linear functionals 
of /. In the case of spatial interpolation, our functionals are: 

9i = Fi(f) + £i = /(/,•) -f £i i = l,...,n, (10.2) 

where s, is additive noise. A regularized estimate f n< \ is obtained by solving 
the minimization problem 


E(/('i)-0.) + A-U/), (10.3) 

1=1 ' ' 

in which J m {) is a seminorm in H m ( H m is a reproducing kernel Hilbert 
space of functions defined in fi) defined by 


J,n(f) = 


f /• + 00 m 

// SC) 

J ™ „ = 0 


d mf 


dx u dy 


m — v 


dxdy, 


(10.4) 


(here m indexes the highest square integrable derivative) and A controls the 
tradeoff between the degree of approximation of the solution to the data and 
the smoothness of solution. The value of A can be computed by the method 
of generalized cross-validatioiJ 21 ^ 22 !. If m = 2 we have the functional (10.1). 
The solution of this minimization problem is one of the “thin plate splines,” 
so called because Jz{f) is the bending energy of a thin plate. 


In [58] it, was shown that a unique solution exists for any A > 0 provided: 

(1) m > 1; 


(2) n > M = ( r " 2 +1 ): 

(3) the design ^i, • • •, t n is unisolvent, that is if is a basis for the 

M dimensional space of polynomials of total degree m - 1 or less, then 

«i/0i/(*t) = 0(i — 1,... ,n) implies that the a u = 0. 

If m = 2, then we need at least three points which do not lie on the 
same straight line (to satisfy the requirement of a unisolvent design), which 
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is the same requirement, as found in [40] and [41]. Moreover, the solution has 
an explicit representation^ 58 ’ as: 

Ml n 

fn , rn , A ( 0 =^C j E m {t,tj) + '*T d v<f>At), (10.5) 


where 





E,n{s,t) = 

= - *| 2 log \s - 

t\ 

with 




s = (*i,yi) t 

= U’2, U2) 

\s -t\ = ^/(x, - 

*c 2 ) 2 + {yi - 3/2 ) 2 

and 





o m = 1/ 

2 2m - 1 ;r[(m - l)!j 2 . 

(10.7) 

The coefficients c 

(c 1 , . . . , 

c n ) and d = ( d \,.. 

•, d n ) are determined by 


the solution of the algebraic linear system: 

(A" + p)C + Td = g 

T r c = 0 (10.8) 

where K is the n x n matrix with K jk = E m (ij,t k ), p = r?A, T is the n x m 
matrix with T ui = ^(/ t ) and g = (g u .. .,g n ). 


10.3. Surface interpolation on a regular grid 

While surface interpolation from sparse data requires an arbitrary grid of 
knots, other problems of machine vision require the approximation of a 3D 
surface through points given on a rectangular grid. For example, when a 
smooth function / interpolating intensity values on the regular grid of a 
CCD camera is regularized, it is possible to use doubly cubic splines or a 
tensor product of splines, giving an interpolating function that minimizes 

J j (d 4 f /dx 2 dy 2 ) 2 dxdy. (10.9) 

In tins case different kinds of doubly cubic splines can be used, according to 
the available data 60 L The algorithms are then convolution algorithms (see 
section 8.4). 
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11. Shape from shading 


It is a common experience to notice our ability to recover the shape of an 
object from its shading. Convexity or concavity of viewed objects are easily 
understood by looking at the profile of radiating light. Here we have another 
classical problem of early vision, “shape from shading,” which has stimulated 
elegant mathematical approaches. The problem of shape from shading was 
ini tally formulated in [61] and [62) as the solution of five ordinary differential 
equations called the characteristic strip equations. Of considerable use in 
this problem has been the reflectance map R(p,q) [ 63 M 64 1 which specifies the 
radiance of a surface patch as a function of its orientation, determined by 
the pair (p,q). If z(x,y) is the surface of the object, p and q are defined as 


P = 



and 


and the unit normal n to the surface is 



(H.l) 


1 


n 


V 1 bp 2 f 5 


= {-/b -9,1}- 


( 11 . 2 ) 


The reflectance map can be computed from the bidirectional reflectance- 
distribution function and the light source arrangement} 63 !. 

Formally, given an image E(x,y) and a reflectance map R(p,q), the 
shape from shading problem may be regarded as the recovery of a smooth 
surface z{x,y) satisfying the image irradiance equation 

^=*(£>!)=*(*«> ^ 
over some domain 12 of the image. Since there are two unknown functions 
{p and q) and only one equation the solution is not unique and the prob¬ 
lem is underconstrained (and ill-posed). Uniqueness of the solution can be 
recovered by the use of photometric stereo, which takes multiple images of 
the same scene from the same position with different illumination^ 65 !. In this 
approach, several equations of the type of Equation ( 11 . 3 ) are available, with 
different reflectance maps since the illumination source is different. Three 
different light sources can be used to obtain a unique solution. 

If only one source of illumination is available, uniqueness can be restored 
by variational techniques similar to those previously seen. Assuming that the 
object has a Lambertian surface and is illuminated by a planar wave of light 
(and the unit vector s — (s 1? t s 2 , S 3 ) points to the light source), then the 
Lambertian reflectance map becomes 

R{p, q) = n ■ s. 


(11-4) 
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If, instead of using the pair {p, 9 }, the new variables {/,#} are introduced 


/ = 


2 p 


2 ? 


1 r y/l+ p 2 + q 2 
the reflectance map becomes 


v ' 1 i P l 


R(f.9) 


4 ~(/ 2 + 3 2 ) 
4 + (/ 2 + g 2 ) 


(11.5) 


4/ 


4<7 


,1 


( 11 . 6 ) 


4 — (/ 2 + ff 2 )’ 4 -(P+gi) 

The problem of shape from shading can be formulated either using the un- 
known r7 or the pair {p, q} or {/,#}. 


11.1. The variational approach to shape from shading 

When the unknown u is used, the variational approach is to find n{x, y) such 
that it minimizes 

(E(**y) ~ n • s) 2 dxdy + \j (n 2 0 n 2 y )dx dy , ( 11 . 7 ) 

with the constraint |jn|| = 1 . In this case, the variational problem is quadratic 
in the unknown n, but the constraint ||n|| — 1 is unusual. 

When the pair {/,#} is used, we seek functions / and g minimizing: 

/ - R{f,g)\ 2 dxdy 0 A f (/* -fi f 2 +g\ 4- g 2 )dx dy, (11.8) 

J fi J n 

with R(f,g) given by Equation (11.6). The variational problem is not 
quadratic in the unknown {f,g} and the results of nonlinear inverse problems 
have to be used. 


11.2. Regularization of shape from shading 

We give an application of the result stated in Section 6 by formulating the 
problem in terms of the pair {p,q}. We define the space X as the direct sum 
L 2 (n) 0 L 2 { fi), i.e., u is a pair {p, q} of square integrable functions: 


a\ 


x 


/ P 2 (-r, y)dxdy 0 I q 2 (x,y)dxdy. 
Jn Jo 


(11.9) 


Let the space 1 be also a space of square integrable functions (we now call 
g{x,y) the image £(*,*/)), and from (11.2), (11.4) we define a nonlinear 
operator .4 : X —► I' as follows: 

•S3 - psi - qs 2 


(Aw)(x,p) 


a/1 0 p 2 + q 2 


( 11 . 10 ) 
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Because n and s are unit vectors, it is obvious that \(Au)(xy)\ < 1 for any 
{.r, (/} G ft. It follows that the domain of .4 is X and that the range of .4 is 
contained in the set of g(x,y) such that jp(«c, t/)j < 1 in ft. Furthermore, it 
is not difficult to prove that the operator .4 is continuous everywhere, i.e., if 
u is any element of X and if {ti n } is a sequence convergent to u , then Au n 
converges to Au. Indeed, using the inequalities 

1*3 - PS 1 - 9« 2 | < y/l +p 2 + v/1 + qi, > 1, (11.11) 

it follows that 


I Au - ,4w n | < 1^11 |p - pn \ + \s 2 

Then, using the inequality ( q l -r 
we get 


- Qn I + | y/l + P 2 T q 2 - Vl + P 2 n + Qn | • 

( 11 . 12 ) 

• -Aq n ) 2 S n(qj + . . . + <? 2 ) (with n — 2, 3), 


I-4 m - Au n I 2 < (|p - p n | 2 + \q- g,,! 2 ). (11.13) 

By integrating over ft we get the continuity of the operator ,4. 

Finally, we consider the constraint operator C defined by 


Cu\z = 



Co(p 2 + q 2 ) 4 - Cl (p 2 + Py + q 2 x + q 2 y )] dxdy 


(11.14) 


where C Q could take the value C c = 0 and give the stabilizer used by Ikeuchi 
and Horn. 


We can seek a solution to the problem of shape from shading by mini¬ 
mizing the functional 


/ \{Aii)(x,y) - g(x,y)\2dxdy + A||Cu|fi, (11.15) 

J n 

where the first term in Equation (11.15) is Equation (11.10) and the con¬ 
straint operator C is defined in Equation (11.14). Because the operator .4 
is continuous and the constraint operator has a compact inverse, the results 
presented in Section 6 indicate the existence of at least a local minimum of 
the functional (11.15). Furthermore, if g E R{A ), the u\ converges to an 
exact solution when A > 0. 
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12. Discussion 


The review ol early vision presented in Part Two shows that certain regular¬ 
ization techniques can be useful for a correct and sound “solution” of many 
vision problems. The key idea of all regularization techniques is to introduce 
a priori knowledge — or constraints — which have to satisfy the solution. 
Therefore we will have different solutions according to the assumptions we 
have made, that is our a priori knowledge about the world. 

Physical plausibility of the solution is the most important criterion in se¬ 
lecting a good solution. The decision regarding the choice of the appropriate 
stabilizing functional cannot be made judiciously from purely mathemati¬ 
cal considerations. A physical analysis of the problem and of its generic 
constraints plays the main role. Standard regularization theory provides 
a framework within which one has to seek constraints that are rooted in 
the physics of the visual world. Standard regularization, however, offers a 
restricted universe of possible constraints since only certain a priori assump¬ 
tions can be translated into the language of Tikhonov stabilizers. 

In our example of the computation of motion, the constraint of smooth¬ 
ness is justified by the observation that the projection of three-dimensional 
objects in motion onto the image plane tends, in a probabilistic sense, to yield 
smoother velocity fields t 37 '. In the case of edge detection the constraint of 
a small norm for the derivative of image intensity is directly justified by the 
bandlimiting properties of the optics. In the case of motion, however, and 
more dramatically in the case of surface reconstruction, the constraint of 
smoothness is not always correct. This suggests - as we will discuss later - 
that more general stabilizing functionals are needed to deal with the general 
problem of discontinuities. 

A method for checking physical plausibility of a variational principle is 
to use the Euler-Lagrange equation associated with the variational problem. 
In the computation of optical flow, the following sufficient and necessary 
condition has been obtained (see also Section 9) for the solution of the 
variational principle Equation (9.26), to be the correct physical solution: 

- d 2 V 

= * 12 -D 

where t is the tangent vector to the contour and V is the true velocity 
The equation is satisfied by a uniform translation or expansion and by 
rotation only if the contour is polygonal. Therefore, the smoothness principle 
will give correct results when (a) motion can be approximated locally by pure 
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translation, rotation or expansion, or (b) objects have images consisting of 
connected straight lines. In other situations, the smoothness principle will 
not yield the correct velocity field, but may yield one that is qualitatively 
similar and close to human perception^ 37 ^ 3 ^. In the corresponding case for 
edge detection (intended as numerical differentiation), the solution is correct 
if and only if the intensity profile is a polynomial spline of appropriate degree. 

From a more biological point of view, a careful comparison of the vari¬ 
ous “regularization solutions with human perception promises to be a very 
interesting area of research, as suggested by Hildreth’s work on the compu¬ 
tation of motion. For some classes of motions and contours, the solution 
of Equations (9.19) and (9.22) is not the physically correct velocity field. 
In these cases, however, the human visual system also appears to derive a 
similar, incorrect velocity field^ 37 ^ 38 ). 

It may be useful to remember again that Tikhonov stabilizers do not rep¬ 
resent the only way to regularize ill-posed problems. Different or additional 
constraints such as shape (monotonicity, convexity) have been proposed. 

The most obvious way to solve inverse ill-posed problems without requir¬ 
ing smoothness of the solution is to use Markov Random Fields as proposed 
in [36]. In this approach, discontinuities can be preserved introducing ap¬ 
propriate line processes [34 )’t 35 ^ and appropriate potential terms. A possible 
problem in this approach is the critical dependency of the solution on the 
parameters coupling the different Markov Random Fields. 

12.1. Stereo matching 

Not all inverse problems of early vision can be solved using the regularizing 
techniques introduced in Part One. For example, stereopsis, which is the 
process that computes depth from two images of the same scene obtained by 
two eyes or cameras, appears as an inverse problem that may be approached 
with standard regularization techniques. It turns out that this is, however, 
quite difficult. The critical problem in stereopsis is the correspondence prob¬ 
lem, that is, the matching of corresponding features in the two images. Let 
us consider the 1-D matching problem, by considering the intensity profile 
or some corresponding feature map — on conjugated epipolar linesf 67 !. 
In this case, the obvious way to match the right image R{x) with the left 
one L{x) is to find the disparity d{x) such that the two intensity profiles 
L{x) and R(x T d(x)) are as close as possible. We can formalize this in the 
following way: let us define an operator P R that depends on the image as 

PrI{x) R{x + f{x)). (12.2) 
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The disparity function that we want could be seen as the solution to the 
inverse problem: 

L{x) = P R d(x). (12.3) 

The operator in Equation (12.3) which has to be inverted depends on the 
data and is not known ci priori. This class of problems is not covered by 
the available mathematical results. We could still try to determine d(x) by 
minimizing 

||T(j*) - R(x + d(x))||. (12.4) 

A sufficient condition for the solution of (12.4) to be unique is that L^x) 
and R{x) are strictly monotonic functions of x. This is clearly a very re¬ 
strictive condition, almost never satisfied by real images. In general, the 
problem admits many solutions unless constraints are imposed on d(x). If 
we use constraints of the Tikhonov type, w'e can look for a solution d(x) that 
minimizes 

\\L(x) - R(x -b d(.r))|| -f A ||</'(*)|i. (12.5) 

The second term in (12.5) is the disparity gradient, which is thus introduced 
as a direct consequence of regularization methods. 

One important property of the disparity is that d(z) can be discontin¬ 
uous. Furthermore, there are often occlusions, that is regions in one image 
that do not correspond to any part in the other image. In this case, d(x) is 
not defined. 

Because of the presence of occlusions and discontinuities in the dispar¬ 
ity, Equation (12.5) does not provide a physically plausible solution. Equa¬ 
tion (12.5) requires d(x) to be continuous and differentiable. Equation (12.5) 
is, however, valid if the disparity gradient is strictly less than 2 (Julesz’ def¬ 
inition). in this case there are no occlusions and Equation (12.5) provides a 
physically plausible solution. 

Another problem with Equation (12.5) is that in many instances match¬ 
ing is not performed between the intensity profiles in the two images, but 
between features maps. In this case, L{x) and R{x) are not continuous 
functions of x. 


12.2. Pseudoinverses versus regularized solutions 


It may be useful to summarize some mathematical conclusions on the re¬ 
lationship between pseudoinverses and regularized solutions that may be 
relevant in early vision. 
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(1) When we are dealing with operators defined on Hilbert spaces A, Y 
of the type 

L : A' h-> 1', (12.6) 

we have three main cases: 

(a) if £ is injective, linear, and continuous, and the range of L is given 
by R(L) = 1 , the inverse problem is trivially well-posed because the 
inverse operator is continuous; 

(b) if L is not injective, and R(L) is closed, then by the method of pseu¬ 
dosolutions, the inverse problem becomes well-posed in the sense that 
the generalized inverse is continuous; 

(c) if R{L) is not closed, the use of pseudosolutions by itself does not guar¬ 
antee the existence and continuity of the inverse solution in the case of 
noisy data; then, we have to apply regularization methods. This case is 
the normal one in early vision problems, because noise is always present 
in the data. 

(2) When we are considering operators defined on finite dimensional 
spaces R n and R rr \ of the type 

L:R"^R m (12.7) 

we have again three main cases: 

(a) If p denotes the rank of the matrix associated with the operator L and 
P = n = m, then L is injective and R{L) = R rn . A solution always exist 
and is unique and the inverse problem is well defined. 

(b) If p < n, then uniqueness does not hold, but it can be restored by 
considering the generalized Moore-Penrose inverse. 

(c) If p < m, then existence does not hold for arbitrary data but it can be 
restored again by considering the Moore-Penrose inverse. 

For operators in finite dimensional spaces, the inverse or generalized in¬ 
verse is always continuous. Therefore with finite dimensional spaces, the 
use of generalized Moore-Penrose inverses is sufficient to guarantee well- 
posedness and the use of regularization methods is not strictly required. It 
is useful to remember again that well-posed problems can be ill-conditioned, 
<md m such a case it is necessary to use regularization methods as m the case 
of ill-posed problems. This is the case when the input data are very noisy 
and when differentiation on the input data is required, as m the recovery of 
optical flow or edge detection. Finally, it is useful to observe that: 

(3) The distinction between interpolation and approximation of discrete 
data is associated with the use of pseudoinverse solutions or regularization 
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methods. Regularization methods are intrinsically approximating solutions, 
while pseudosolutions can be seen as interpolating solutions through the 
original data. 


12.3. Learning regularization algorithms from examples 


12.4. Limits of regularization methods in vision 


We believe that regularizing methods provide a useful and mathematically 
sound framework for many problems in early vision. Not all problems in 
early vision, however, can be easily treated in this framework (e.g. stereo 
matching). Moreover it is often desirable to have solutions that preserve 
discontinuities. In this case Markov Random Fields formulations' 36 ! are likely 
to represent more powerful tools. 

Beyond early vision, efficient vision systems are likely to use more com¬ 
plex and elaborate a prion information and to be equipped with reasoning 
capabilities which encompass early vision and regularization methods. It is 
possible, though as yet unclear, that between early vision and high vision 
different sources of low-level informations are integrated into a unified repre¬ 
sentation of surface properties, such as the 2 \D sketch. At this point MRF 
models could be quite useful in providing a flexible (though complex) tool 
for sophisticated integration. 
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13. Appendices 


13.1. Appendix A. 


For the convenience of the reader, we summarize in this appendix some basic 
ideas of functional analysis which are used in the paper. 

All the questions of existence, uniqueness, and continuity of the solution 
have a precise meaning (and a precise answer, when the answer is known) if 
we carefully define the sets X and Y to which the functions m, g (Equation 
(3.1)) belong. In particular, continuity requires that we define what we mean 
by the "vicinity” of two functions in X or Y; i.e., we must introduce a metric 
in both spaces. 

Among various possible choices oi metrics, the one corresponding to a 
Hilbert space is the most simple and interesting, since a Hilbert space is the 
space most similar to the usual Euclidean space. 

A Hilbert space X is a linear space of functions, satisfying the following 
conditions: 

(a) For any pair of functions u.r t£ X (with * being complex conjugate, a 
complex (real) valued function {u,v) x , called the scalar product in X , 
is defined such that: 


(i) (tt, a)x > 0, — 0, if and only if u = 0: 

(ii) ( a , v) x = (<’, u)* x \ 

(iii) (Am -f fdi\ z) x = A(m,~)y Y u(l\z)x for arbitrary complex (real) 
numbers A, /l/; 

(b) X is complete; i.e., the Cauchy criterion for the convergence of sequences 
holds true; 


(c) X is separable; i.e., there exists a countable orthonormal basis. 

The classical example ot a Hilbert space is provided by any space of 
square integrable functions (T 2 -spaces), the scalar product being defined by 


ju U 


(u,u)x = I u[x)v*(x)dx. 

A norm can be introduced in A" by means of the relation 

IMI.y = («» u )x\ 

and it satisfies the properties: 

(i') ||m||x > 0,= 0 if and only if u = 0; 


(.u) 


(- 12 ) 
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(ii') A u x — |A111tt|j_y for any complex (real) A; 

(iii') ||a -f o11 y < ||m|| Y ||'i’ll y (triangle inequality). 


Then the distance p( u , v) from u to v is defined by 


"OmO = || M- ~ 1’IIy- (-43) 

A real-valued functional p(u), defined on X, is called a seminorm on X, if it 
satisfies properties (ii') and (iii^) of the norm. Then it follows that p(0) = 0 
and p{u) > 0 but the condition p(u) = 0 does not necessarily imply u = 0. 
The set of the functions such that p(u) = 0 is a linear subspace called the 
null space of p, i.e., it contains the zero of X, and, if it contains u,v, then it 
contains also Aa + pe, for any complex (real) A, p. An example of a seminorm, 
which is not a norm, is the following: 


PM = (J | u'(x)\ 2 dx^ 


1/2 


The null space of p( u) contains all the constant functions. 


(-44) 


An operator from a Hilbert space X into a Hilbert space Y is defined 
by a mapping which transforms functions of a subset of A r (the domain of 
the operator, denoted by D(L)) into functions of a subset of Y (the range 
of the operator, denoted by R(L)). When the domain is a linear subspace 
and the mapping is linear, the operator is called linear. We use the notation 
L : X —+ Y for denoting a linear operator from X into Y. 

If D(L) = A" and if, for any sequence u n converging to u, the sequence 
Lu n converges to Lt/, then I is a continuous operator. A linear operator L 
is continuous if and only if it is bounded, i.e., there exists a constant C such 
that, for any u 6 -Y 

IIHIr < C\\n\\ x . (-45) 

The quantity 


sup 

L U y 

u ex' 

u x 


is called the norm of the operator L. 

A linear continuous operator L : X —► Y always admits an adjoint 
operator L* : Y —> X , defined by 

{Lu, r)r = (u,L*v)x (-47) 

for any u G -Y, c 6 1 . The operator L* is also linear and bounded and 

\\L* || = ||L||. 


Tsing the definition of the adjoint operator, we can write Equation (A6) 
in the following form 


r sup (L*Lu,u) Y 1 1 2 
1 u G A ( u , u ) y * 


(AS) 
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therefore, ||Xj is the square root of the supremum of the spectrum of L L. 
Here we have used a result which is an extension of the well-known variational 
property of the maximum eigenvalue of a matrix. 

A particular class of linear operators is provided by the linear function¬ 
als, which correspond to the case where Y is the space of the real (complex) 
numbers. Therefore, a functional is an operator which associates numbers 
to elements of X. Linear continuous functionals are characterized by the 
Representation Theorem of F. Riesz: let F(a) be a continuous functional on 
X , then there exists a unique function (p £ X such that 

F(u.) = (u,<j>) x (.49) 

for any u £ X. 


A linear operator L : X —> Y is said to be compact (or also completely 
continuous) when it is bounded and transforms any bounded set of A' into a 
precompact set of Y (a precompact set is a set whose closure is compact; i.e., 
it has the Bolzano-Weierstrass property). Compact, operators are interesting 
since they are the most similar to matrices. 


In order to establish the compactness of an operator, one needs com¬ 
pactness criteria in functional spaces. The basic result is the Ascoli-Arzela 
Theorem, whose proof is the paradigm of all the proofs of compactness. 
Ascoli-Arzela’s theorem states that a sequence of continuous functions w„(.r) 
is precompact if: 


(i) The functions u n (x) are uniformly bounded; i.e., there exists a constant 
C such that 

l««(z)l A C (A10) 

for any n and any x. 

(ii) The functions u n (x) are uniformly continuous; i.e., for any s > 0 there 
exists 8 > 0 such that 

\u n {x) - u n (x ')| < s (All) 

for any .r, x' with \x — x'\ < 8 and any n. 


A very simple example of a compact operator in a L 2 -space is provided 
by an integral operator 

{Lu){x) = J K{x,y)u(y)dy, (A12) 

whose kernel h(x,y) is square integrable 

jdx 1 dy\K(x,y)\ 2 < +oo. 


(-113) 
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The most striking property of compact operators in Hilbert spaces is 
that they have a singular value decomposition (SVD) similar to that of a 
matrix. The singular system of a compact operator is defined as the set of 
the solutions of the coupled equations 

Lu k = a k v k , L*v k = a k u k , (.414) 

where the <\ k (the singular values) are positive numbers and the u k ,v k (the 
singular functions) are functions in X and Y respectively. 

When L is compact, its singular system always exists and has the fol¬ 
lowing properties: the a k have finite multiplicity and tend to zero when 
k —> oo (we exclude here the case of a finite rank operator); the u k form an 
orthonormal basis in the orthogonal complement of N(L) and the v k form an 
orthonormal basis in the orthogonal complement of N{L*), i.e. the closure 
of R{L). 


It is easy to verify that a function g G Y is in the range of L if and only 
if (Picard conditions) 


gtN(L*y L 


E 


\(9,vk)y\ 2 



< Too. 


(.415) 


Since a. k —> 0 when L is not a finite rank operator, it is clear that an arbitrary 
function orthogonal to N(L*) does not always satisfy the second condition 
in Equation (A 15) and therefore R(L) is not closed. 


We conclude that the problem (3.1) with L compact is ill-posed. 

On the other hand, the range of a finite rank compact operator is closed 
since its dimension is finite. 


Another example of continuous operators with closed range is provided 
by the projection operators; i.e., linear operators P such that: 

(i) P* = P ; 

(ii) P 2 = P. 

It follows that j|P|| - 1. Furthermore, N(P) is the set of all the functions 
u = (I -P) v i where v is an arbitrary function of A r and R(P) is the set of all 
the functions u such that u = Pu. Therefore R(P) is closed. A projection 
operator is compact if and only if the dimension of R{P) is finite. 



13.2. Appendix B 


In this Appendix we will outline the proof of the main result stated in Section 
3.3; namely, that if the constraint operator C satisfies conditions (i) — (iii), 
there exists a unique C-generalized solution t/ + for any g such that Pg £ 
LD{C). However, we first show that conditions (ii) and (iii) are satisfied by 
seminorms defined in terms of differential operators. 

Consider in £ 2 (0,1) the following seminorm: 

?(«) = (/ \u^ dx ) , (51) 

where = d k u/dx k . The domain of the operator Cu = u^ k) is the set 
of all functions u which have square integrable derivatives at least up to 
order k. Therefore, C is not everywhere defined in T 2 (0,1) and is not con¬ 
tinuous (bounded). However, a differential operator such as C is a typical 
example of a closed operator, i.e., of an operator satisfying the following 
conditions^ 33 ': if {«„} C D{C) is a sequence convergent to u and such that 
{C*n} is a sequence convergent to v £ Z, then ii £ D(C) and Cu = v. 
Furthermore, in our specific case, R(C) = L 2 { 0,1) since, given an arbitrary 
function v £ L 2 (0,1), there always exists a function u £ D{C ) such that 
id ^ = i\ Therefore, C satisfies condition (ii). This condition implies that 
C has a bounded generalized inverse C + since C + is just the inverse of the 
restriction of C to N(C) L (the inverse of a closed operator is also closed and 
an everywhere-defined closed operator is bounded). 

As concerns Condition (ii), notice that N(C) is the set of all the polyno¬ 
mials of degree S k — 1. Therefore, N(C) is a k -dimensional closed subspace. 
It follows that, whenever I is a linear, continuous operator defined on X 
with its range in an arbitrary Hilbert space Y, LN(C) is a ^-dimensional 
closed subspace in Y. 

Finally, as concerns Condition (l), it. is satisfied whenever iV( L ) contains 
no polynomials of degree < k — 1. 

The proof of the result stated in Section 3.3 is based on the fact that 
conditions (i) - (iii) imply the following one: there exists a constant fS 2 such 
that, for any u £ D(C) 

i|£«i|y + \\Cu\\ 2 z > /5 2 ||w|| 2 y . (B2) 

This condition is called by Morozov the completion condition^ 27 '. Suppose 
we define on D(C) the scalar product 

(u, e) 0 = (Lu,Lv) y + (Cu,Cv) z . {B 3) 


55 


Then from Condition (i) it follows that ||u|| 0 = 0 implies ||«|| = 0, and 
D(C ) is complete in the topology induced by this scalar product, i.e., D(C) 
becomes a Hilbert space. Condition (B2) indeed implies that any Cauchy 
sequence in the topology of D(C) is also a Cauchy sequence in X. 

In order to prove inequality (B2), we introduce the space W = Y © Z 
and define the operator B : X •—> W as follows: 

Bu = {Au,Cu}, u e D(C). (B4) 

Then, since B is closed, inequality (B2) is equivalent to stating that. B has 
an inverse B~ l and that R(B) is closed in XV. 

The existence of B l is an easy consequence of Condition (i). In order 
to prove that R{B) is closed, let {a n } C J 0(C) be a sequence such that 
{Bu n } is convergent in W\ we must prove that the limit belongs to R(B). 
Since {Bu n } is convergent in W, it follows that {Au n } is convergent in Y 
and {C u n } is convergent in Z. Put u n = u ( n 0) + with u ( * ] £ NIC) and 

Un e N(C) L . As already remarked, the restriction of C to N(C) 1 - has a 
bounded inverse and therefore there exists a constant 7 2 such that 

\\Cu n \\z = \\Cu' n '% - 7 2 ||^ 1) llz- (B5) 

This implies that. { u n } is a Cauchy sequence and therefore it is convergent. 
Let u (1) be the limit. Since the operator C is closed, u (1) € D(C ) and 
{Cu n } converges to Cu {1 \ Now we have Lu n = Lu n 0) © Lu ( n 1] and both 
{Lu n } and {Lu n j- are convergent. It follows that {Ti/n ^ j- is also convergent, 
and, thanks to the closure of IiY(C), there exists «. (0) £ N{C) such that 
Lu n converges to Lu {0) . By combining all the results we have that {Bu n } 
converges to B{u {0) + u (l) ) and therefore R(B) is closed. 

Now, starting from the completion condition (B2), the proof of the ex¬ 
istence of the C’-generalized solution for any g such that Pg £ LD{C) can be 
done as in [2 ']. The proof is just an easy extension of the proof of the classical 
result that any closed and convex set has a unique element of minimal norm. 
Notice that in [27], the C-generalized solution is called the solution of the 
basic problem. 

When the operator C has a bounded inverse C *, conditions (i) - (iii) 
are obviously satisfied. In such a case, the C-generalized L+ is given in [3]: 

L^ c = (B6) 

where ( LC " 1 ) + is the generalized inverse of the operator LC~ A : Z —> Y. 

An example of an operator C satisfying this assumption is the following: 
take X = Z 2 (0,1) and Z = T 2 (0,1) © T 2 (0,1) and define C by 

Cu — {u, u'} 


(B 7) 
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with domain the set ot the absolutely continuous functions with square inte- 
grable first derivative. Then, 



and we have a functional of the type (5.4). 


(B 8 ) 
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13.3. Appendix C 


In this appendix we show that the problem of linear interpolation is equiva¬ 
lent to the computation of a generalized solution in a suitable space. 

Let A" be a space of differentiable functions, defined on the interval [0,1] 
and having a square integrable first derivative. X is an Hilbert space if we 
define a scalar product by means of the formula 


(u<v) x = u(0)v(0) + f u'(x)v'(x)dx. (Cl) 

Jo 

Let x £ [0,1] be a fixed, arbitrary point; then, from the elementary relation 

u(x) = u(0) + f u'(x')dx' (C2) 

Jo 


it follows that 


“(*) = (UjQx)x, (C 3) 

where 

Qx(*') = 1 + min {.c, a*'}. (C4) 

Clearly Q x £ X for any .r, and therefore all the evaluation functionals 
(i.e., the functionals which associate to a function u its value in a given point) 
are continuous. 


A Hilbert space of continuous functions having the previous property is 
called a reproducing kernel Hilbert space. The reproducing kernel Q(. r,.r') 
is defined by 

<?(*,*') = Qx(x') = Qx'{*)i (C 5) 

and its name is due to the relation 

{Qx,Qx')x = Q(xjx'). (C6) 

Assume now that a function u £ X is specified at the points x u x 2l ...,x N 
(• r n C [0,1]) and let g x , g 2 ,..., gjy be its values. The interpolation problem 
(i.e., find u £ X such that u(x n ) = g n for n = 1 ,..., N) can be formulated, 
thanks to (C3), as the problem of determining u £ X such that 

{uiQx n )=g n ; n — 1,..., N (C 7) 

and therefore it takes the form (3.12). (3.13). If we recall that the generalized 
solution is orthogonal to N(L) (Section 3) and that N(L) is the orthogonal 
complement of the subspace spanned by the functions 

= Q(^’n,.c) (C8) 

(L is defined as in Equation (3.14)), we conclude that the generalized solution 
must be a linear combination of the functions <f>„ 

N 

u + {x) = y^c ra Q(ar„, x). 

n=l 


(C9) 
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From Equation (C4) it follows that </ + (.r) is just the linear interpolation of 
the data g n . 

Interpolation by means of splines of degree m — 2k - 1 (k "> 1) can 
be obtained along similar lines by a suitable definition ot the reproducing 
kernel Hilbert space A" [l2] . Interpolation by means of natural splines of 
the same degree^ 13 ! can be formulated as the problem of determining, in the 
same space, a O-generalized solution which minimizes the seminorm (Bl). 
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